RNA-antibody fusions and their selection

ABSTRACT

Described herein are methods and reagents for the selection of protein molecules that make use of RNA-protein fusions.

BACKGROUND OF THE INVENTION

This application is a continuation of application, Szostak et al., U.S. Ser. No. 09/007,005, filed Jan. 14, 1998, now U.S. Pat. No. 6,258,358, which claims benefit from provisional applications, Szostak et al., U.S. Ser. No. 60/064,491, filed Nov. 6, 1997, now abandoned, and U.S. Ser. No. 60/035,963, filed Jan. 21, 1997, now abandoned.

This invention relates to protein selection methods.

The invention was made with government support under grants F32 GM17776-01 and F32 GM17776-02. The government has certain rights in the invention.

Methods currently exist for the isolation of RNA and DNA molecules based on their functions. For example, experiments of Ellington and Szostak (Nature 346:818 (1990); and Nature 355:850 (1992)) and Tuerk and Gold (Science 249:505 (1990); and J. Mol. Biol. 222:739 (1991)) have demonstrated that very rare (i.e., less than 1 in 10¹³) nucleic acid molecules with desired properties may be isolated out of complex pools of molecules by repeated rounds of selection and amplification. These methods offer advantages over traditional genetic selections in that (i) very large candidate pools may be screened (>10¹⁵), (ii) host viability and in vivo conditions are not concerns, and (iii) selections may be carried out even if an in vivo genetic screen does not exist. The power of in vitro selection has been demonstrated in defining novel RNA and DNA sequences with very specific protein binding functions (see, for example, Tuerk and Gold, Science 249:505 (1990); Irvine et al., J. Mol. Biol. 222:739 (1991); Oliphant et al., Mol. Cell Biol. 9:2944 (1989); Blackwell et al., Science 250:1104 (1990); Pollock and Treisman, Nuc. Acids Res. 18:6197 (1990); Thiesen and Bach, Nuc. Acids Res. 18:3203 (1990); Bartel et al., Cell 57:529 (1991); Stormo and Yoshioka, Proc. Natl. Acad. Sci. USA 88:5699 (1991); and Bock et al., Nature 355:564 (1992)), small molecule binding functions (Ellington and Szostak, Nature 346:818 (1990); Ellington and Szostak, Nature 355:850 (1992)), and catalytic functions (Green et al., Nature 347:406 (1990); Robertson and Joyce, Nature 344:467 (1990); Beaudry and Joyce, Science 257:635 (1992); Bartel and Szostak, Science 261:1411 (1993); Lorsch and Szostak, Nature 371:31-36 (1994); Cuenoud and Szostak, Nature 375:611-614 (1995); Chapman and Szostak, Chemistry and Biology 2:325-333 (1995); and Lohse and Szostak, Nature 381:442-444 (1996)). A similar scheme for the selection and amplification of proteins has not been demonstrated.

SUMMARY OF THE INVENTION

The purpose of the present invention is to allow the principles of in vitro selection and in vitro evolution to be applied to proteins. The invention facilitates the isolation of proteins with desired properties from large pools of partially or completely random amino acid sequences. In addition, the invention solves the problem of recovering and amplifying the protein sequence information by covalently attaching the mRNA coding sequence to the protein molecule.

In general, the inventive method consists of an in vitro or in situ transcription/translation protocol that generates protein covalently linked to the 3′ end of its own mRNA, i.e., an RNA-protein fusion. This is accomplished by synthesis and in vitro or in situ translation of an mRNA molecule with a peptide acceptor attached to its 3′ end. One preferred peptide acceptor is puromycin, a nucleoside analog that adds to the C-terminus of a growing peptide chain and terminates translation. In one preferred design, a DNA sequence is included between the end of the message and the peptide acceptor which is designed to cause the ribosome to pause at the end of the open reading frame, providing additional time for the peptide acceptor (for example, puromycin) to accept the nascent peptide chain before hydrolysis of the peptidyl-tRNA linkage.

If desired, the resulting RNA-protein fusion allows repeated rounds of selection and amplification because the protein sequence information may be recovered by reverse transcription and amplification (for example, by PCR amplification as well as any other amplification technique, including RNA-based amplification techniques such as 3SR or TSA). The amplified nucleic acid may then be transcribed, modified, and in vitro or in situ translated to generate mRNA-protein fusions for the next round of selection. The ability to carry out multiple rounds of selection and amplification enables the enrichment and isolation of very rare molecules, e.g., one desired molecule out of a pool of 10¹⁵ members. This in turn allows the isolation of new or improved proteins which specifically recognize virtually any target or which catalyze desired chemical reactions.

Accordingly, in a first aspect, the invention features a method for selection of a desired protein, involving the steps of: (a) providing a population of candidate RNA molecules, each of which includes a translation initiation sequence and a start codon operably linked to a candidate protein coding sequence and each of which is operably linked to a peptide acceptor at the 3′ end of the candidate protein coding sequence; (b) in vitro or in situ translating the candidate protein coding sequences to produce a population of candidate RNA-protein fusions; and (c) selecting a desired RNA-protein fusion, thereby selecting the desired protein.

In a related aspect, the invention features a method for selection of a DNA molecule which encodes a desired protein, involving the steps of: (a) providing a population of candidate RNA molecules, each of which includes a translation initiation sequence and a start codon operably linked to a candidate protein coding sequence and each of which is operably linked to a peptide acceptor at the 3′ end of the candidate protein coding sequence; (b) in vitro or in situ translating the candidate protein coding sequences to produce a population of candidate RNA-protein fusions; (c) selecting a desired RNA-protein fusion; and (d) generating from the RNA portion of the fusion a DNA molecule which encodes the desired protein.

In another related aspect, the invention features a method for selection of a protein having an altered function relative to a reference protein, involving the steps of: (a) producing a population of candidate RNA molecules from a population of candidate DNA templates, the candidate DNA templates each having a candidate protein coding sequence which differs from the reference protein coding sequence, the RNA molecules each comprising a translation initiation sequence and a start codon operably linked to the candidate protein coding sequence and each being operably linked to a peptide acceptor at the 3′ end; (b) in vitro or in situ translating the candidate protein coding sequences to produce a population of candidate RNA-protein fusions; and (c) selecting an RNA-protein fusion having an altered function, thereby selecting the protein having the altered function.

In yet another related aspect, the invention features a method for selection of a DNA molecule which encodes a protein having an altered function relative to a reference protein, involving the steps of: (a) producing a population of candidate RNA molecules from a population of candidate DNA templates, the candidate DNA templates each having a candidate protein coding sequence which differs from the reference protein coding sequence, the RNA molecules each comprising a translation initiation sequence and a start codon operably linked to the candidate protein coding sequence and each being operably linked to a peptide acceptor at the 3′ end; (b) in vitro or in situ translating the candidate protein coding sequences to produce a population of RNA-protein fusions; (c) selecting an RNA-protein fusion having an altered function; and (d) generating from the RNA portion of the fusion a DNA molecule which encodes the protein having the altered function.

In yet another related aspect, the invention features a method for selection of a desired RNA, involving the steps of: (a) providing a population of candidate RNA molecules, each of which includes a translation initiation sequence and a start codon operably linked to a candidate protein coding sequence and each of which is operably linked to a peptide acceptor at the 3′ end of the candidate protein coding sequence; (b) in vitro or in situ translating the candidate protein coding sequences to produce a population of candidate RNA-protein fusions; and (c) selecting a desired RNA-protein fusion, thereby selecting the desired RNA.

In preferred embodiments of the above methods, the peptide acceptor is puromycin; each of the candidate RNA molecules further includes a pause sequence or further includes a DNA or DNA analog sequence covalently bonded to the 3′ end of the RNA; the population of candidate RNA molecules includes at least 10⁹, preferably, at least 10¹⁰, more preferably, at least 10¹¹, 10¹², or 10¹³, and, most preferably, at least 10¹⁴ different RNA molecules; the in vitro translation reaction is carried out in a lysate prepared from a eukaryotic cell or portion thereof (and is, for example, carried out in a reticulocyte lysate or wheat germ lysate); the in vitro translation reaction is carried out in an extract prepared from a prokaryotic cell (for example, E. coli) or portion thereof; the selection step involves binding of the desired protein to an immobilized binding partner; the selection step involves assaying for a functional activity of the desired protein; the DNA molecule is amplified; the method further involves repeating the steps of the above selection methods; the method further involves transcribing an RNA molecule from the DNA molecule and repeating steps (a) through (d); following the in vitro translating step, the method further involves an incubation step carried out in the presence of 50-100 mM Mg²⁺; and the RNA-protein fusion further includes a nucleic acid or nucleic acid analog sequence positioned proximal to the peptide acceptor which increases flexibility.

In other related aspects, the invention features an RNA-protein fusion selected by any of the methods of the invention; a ribonucleic acid covalently bonded through an amide bond to an amino acid sequence, the amino acid sequence being encoded by the ribonucleic acid; and a ribonucleic acid which includes a translation initiation sequence and a start codon operably linked to a candidate protein coding sequence, the ribonucleic acid being operably linked to a peptide acceptor (for example, puromycin) at the 3′ end of the candidate protein coding sequence.

In a second aspect, the invention features a method for selection of a desired protein or desired RNA through enrichment of a sequence pool. This method involves the steps of: (a) providing a population of candidate RNA molecules, each of which includes a translation initiation sequence and a start codon operably linked to a candidate protein coding sequence and each of which is operably linked to a peptide acceptor at the 3′ end of the candidate protein coding sequence; (b) in vitro or in situ translating the candidate protein coding sequences to produce a population of candidate RNA-protein fusions; (c) contacting the population of RNA-protein fusions with a binding partner specific for either the RNA portion or the protein portion of the RNA-protein fusion under conditions which substantially separate the binding partner-RNA-protein fusion complexes from unbound members of the population; (d) releasing the bound RNA-protein fusions from the complexes; and (e) contacting the population of RNA-protein fusions from step (d) with a binding partner specific for the protein portion of the desired RNA-protein fusion under conditions which substantially separate the binding partner-RNA-protein fusion complex from unbound members of said population, thereby selecting the desired protein and the desired RNA.

In preferred embodiments, the method further involves repeating steps (a) through (e). In addition, for these repeated steps, the same or different binding partners may be used, in any order, for selective enrichment of the desired RNA-protein fusion. In another preferred embodiment, step (d) involves the use of a binding partner (for example, a monoclonal antibody) specific for the protein portion of the desired fusion. This step is preferably carried out following reverse transcription of the RNA portion of the fusion to generate a DNA which encodes the desired protein. If desired, this DNA may be isolated and/or PCR amplified. This enrichment technique may be used to select a desired protein or may be used to select a protein having an altered function relative to a reference protein.

In other preferred embodiments of the enrichment methods, the peptide acceptor is puromycin; each of the candidate RNA molecules further includes a pause sequence or further includes a DNA or DNA analog sequence covalently bonded to the 3′ end of the RNA; the population of candidate RNA molecules includes at least 10⁹, preferably, at least 10¹⁰, more preferably, at least 10¹¹, 10¹², or 10¹³, and, most preferably, at least 10¹⁴ different RNA molecules; the in vitro translation reaction is carried out in a lysate prepared from a eukaryotic cell or portion thereof (and is, for example, carried out in a reticulocyte lysate or wheat germ lysate); the in vitro translation reaction is carried out in an extract prepared from a prokaryotic cell or portion thereof (for example, E. coli); the DNA molecule is amplified; at least one of the binding partners is immobilized on a solid support; following the in vitro translating step, the method further involves an incubation step carried out in the presence of 50-100 mM Mg²⁺; and the RNA-protein fusion further includes a nucleic acid or nucleic acid analog sequence positioned proximal to the peptide acceptor which increases flexibility.

In a related aspect, the invention features kits for carrying out any of the selection methods described herein.

In a third and final aspect, the invention features a microchip that includes an array of immobilized single-stranded nucleic acids, the nucleic acids being hybridized to RNA-protein fusions. Preferably, the protein component of the RNA-protein fusion is encoded by the RNA.

As used herein, by a “population” is meant more than one molecule (for example, more than one RNA, DNA, or RNA-protein fusion molecule). Because the methods of the invention facilitate selections which begin, if desired, with large numbers of candidate molecules, a “population” according to the invention preferably means more than 10⁹ molecules, more preferably, more than 10¹¹, 10¹², or 10¹³ molecules, and, most preferably, more than 10¹³ molecules.

By “selecting” is meant substantially partitioning a molecule from other molecules in a population. As used herein, a “selecting” step provides at least a 2-fold, preferably, a 30-fold, more preferably, a 100-fold, and, most preferably, a 1000-fold enrichment of a desired molecule relative to undesired molecules in a population following the selection step. As indicated herein, a selection step may be repeated any number of times, and different types of selection steps may be combined in a given approach.
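As a rough illustration of how the enrichment factors defined above compound over repeated selection steps, the following sketch counts the rounds needed before a desired molecule becomes the majority of a pool. It is a simplified model, not part of the described method: it assumes that every round multiplies the desired:undesired ratio by the same fold value, that nothing is lost during amplification, and that "majority of the pool" (a fraction of at least 1/2) is the stopping point. The function names `enrich` and `rounds_needed` are hypothetical.

```python
from fractions import Fraction


def enrich(freq: Fraction, fold: int) -> Fraction:
    """Return the fraction of desired molecules after one selection step.

    Assumption: the step multiplies the desired:undesired ratio by `fold`.
    """
    odds = freq / (1 - freq)   # desired : undesired ratio before the step
    odds *= fold               # idealized, constant fold enrichment
    return odds / (1 + odds)   # convert the ratio back to a fraction


def rounds_needed(start_freq: Fraction, fold: int,
                  target_freq: Fraction = Fraction(1, 2)) -> int:
    """Count selection rounds until the desired fraction reaches target_freq."""
    rounds = 0
    freq = start_freq
    while freq < target_freq:
        freq = enrich(freq, fold)
        rounds += 1
    return rounds


if __name__ == "__main__":
    # One desired molecule in a pool of 10^15 members.
    start = Fraction(1, 10**15)
    for fold in (2, 30, 100, 1000):
        print(f"{fold}-fold per round: {rounds_needed(start, fold)} rounds")
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the round count does not depend on floating-point rounding near the threshold. Under these assumptions, a 1000-fold step repeated five times is enough to bring a 1-in-10¹⁵ molecule to the majority of the pool, whereas a 2-fold step would need fifty repetitions.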

By a “protein” is meant any two or more naturally occurring or modified amino acids joined by one or more peptide bonds. “Protein” and “peptide” are used interchangeably herein.

By “RNA” is meant a sequence of two or more covalently bonded, naturally occurring or modified ribonucleotides. One example of a modified RNA included within this term is phosphorothioate RNA.

By a “translation initiation sequence” is meant any sequence which is capable of providing a functional ribosome entry site. In bacterial systems, this region is sometimes referred to as a Shine-Dalgarno sequence.

By a “start codon” is meant three bases which signal the beginning of a protein coding sequence. Generally, these bases are AUG (or ATG); however, any other base triplet capable of being utilized in this manner may be substituted.

By “covalently bonded” to a peptide acceptor is meant that the peptide acceptor is joined to a “protein coding sequence” either directly through a covalent bond or indirectly through another covalently bonded sequence (for example, DNA corresponding to a pause site).

By a “peptide acceptor” is meant any molecule capable of being added to the C-terminus of a growing protein chain by the catalytic activity of the ribosomal peptidyl transferase function. Typically, such molecules contain (i) a nucleotide or nucleotide-like moiety (for example, adenosine or an adenosine analog (di-methylation at the N-6 amino position is acceptable)), (ii) an amino acid or amino acid-like moiety (for example, any of the 20 D- or L-amino acids or any amino acid analog thereof (for example, O-methyl tyrosine or any of the analogs described by Ellman et al., Meth. Enzymol. 202:301, 1991)), and (iii) a linkage between the two (for example, an ester, amide, or ketone linkage at the 3′ position or, less preferably, the 2′ position); preferably, this linkage does not significantly perturb the pucker of the ring from the natural ribonucleotide conformation. Peptide acceptors may also possess a nucleophile, which may be, without limitation, an amino group, a hydroxyl group, or a sulfhydryl group. In addition, peptide acceptors may be composed of nucleotide mimetics, amino acid mimetics, or mimetics of the combined nucleotide-amino acid structure.

By a peptide acceptor being positioned “at the 3′ end” of a protein coding sequence is meant that the peptide acceptor molecule is positioned after the final codon of that protein coding sequence. This term includes, without limitation, a peptide acceptor molecule that is positioned precisely at the 3′ end of the protein coding sequence as well as one which is separated from the final codon by intervening coding or non-coding sequence (for example, a sequence corresponding to a pause site). This term also includes constructs in which coding or non-coding sequences follow (that is, are 3′ to) the peptide acceptor molecule. In addition, this term encompasses, without limitation, a peptide acceptor molecule that is covalently bonded (either directly or indirectly through intervening nucleic acid sequence) to the protein coding sequence, as well as one that is joined to the protein coding sequence by some non-covalent means, for example, through hybridization using a second nucleic acid sequence that binds at or near the 3′ end of the protein coding sequence and that itself is bound to a peptide acceptor molecule.

By an “altered function” is meant any qualitative or quantitative change in the function of a molecule.

By a “pause sequence” is meant a nucleic acid sequence which causes a ribosome to slow or stop its rate of translation.

By “binding partner,” as used herein, is meant any molecule which has a specific, covalent or non-covalent affinity for a portion of a desired RNA-protein fusion. Examples of binding partners include, without limitation, members of antigen/antibody pairs, protein/inhibitor pairs, receptor/ligand pairs (for example, cell surface receptor/ligand pairs, such as hormone receptor/peptide hormone pairs), enzyme/substrate pairs (for example, kinase/substrate pairs), lectin/carbohydrate pairs, oligomeric or heterooligomeric protein aggregates, DNA binding protein/DNA binding site pairs, RNA/protein pairs, and nucleic acid duplexes, heteroduplexes, or ligated strands, as well as any molecule which is capable of forming one or more covalent or non-covalent bonds (for example, disulfide bonds) with any portion of an RNA-protein fusion. Binding partners include, without limitation, any of the “selection motifs” presented in FIG. 2.

By a “solid support” is meant, without limitation, any column (or column material), bead, test tube, microtiter dish, solid particle (for example, agarose or sepharose), microchip (for example, silicon, silicon-glass, or gold chip), or membrane (for example, the membrane of a liposome or vesicle) to which an affinity complex may be bound, either directly or indirectly (for example, through other binding partner intermediates such as other antibodies or Protein A), or in which an affinity complex may be embedded (for example, through a receptor or channel).

The presently claimed invention provides a number of significant advantages. To begin with, it is the first example of this type of scheme for the selection and amplification of proteins. This technique overcomes the impasse created by the need to recover nucleotide sequences corresponding to desired, isolated proteins (since only nucleic acids can be replicated). In particular, many prior methods that allowed the isolation of proteins from partially or fully randomized pools did so through an in vivo step. Methods of this sort include monoclonal antibody technology (Milstein, Sci. Amer. 243:66 (1980); and Schultz et al., J. Chem. Engng. News 68:26 (1990)), phage display (Smith, Science 228:1315 (1985); Parmley and Smith, Gene 73:305 (1988); and McCafferty et al., Nature 348:552 (1990)), peptide-lac repressor fusions (Cull et al., Proc. Natl. Acad. Sci. USA 89:1865 (1992)), and classical genetic selections. Unlike the present technique, each of these methods relies on a topological link between the protein and the nucleic acid so that the information of the protein is retained and can be recovered in readable, nucleic acid form.

In addition, the present invention provides advantages over the stalled translation method (Tuerk and Gold, Science 249:505 (1990); Irvine et al., J. Mol. Biol. 222:739 (1991); Korman et al., Proc. Natl. Acad. Sci. USA 79:1844-1848 (1982); Mattheakis et al., Proc. Natl. Acad. Sci. USA 91:9022-9026 (1994); Mattheakis et al., Meth. Enzymol. 267:195 (1996); and Hanes and Pluckthun, Proc. Natl. Acad. Sci. USA 94:4937 (1997)), a technique in which selection is for some property of a nascent protein chain that is still complexed with the ribosome and its mRNA. Unlike the stalled translation technique, the present method does not rely on maintaining the integrity of an mRNA:ribosome:nascent chain ternary complex, a complex that is very fragile and is therefore limiting with respect to the types of selections which are technically feasible.

The present method also provides advantages over the branched synthesis approach proposed by Brenner and Lerner (Proc. Natl. Acad. Sci. USA 89:5381-5383 (1992)), in which DNA-peptide fusions are generated, and genetic information is theoretically recovered following one round of selection. Unlike the branched synthesis approach, the present method does not require the regeneration of a peptide from the DNA portion of a fusion (which, in the branched synthesis approach, is generally accomplished by individual rounds of chemical synthesis). Accordingly, the present method allows for repeated rounds of selection using populations of candidate molecules. In addition, unlike the branched synthesis technique, which is generally limited to the selection of fairly short sequences, the present method is applicable to the selection of protein molecules of considerable length.

In yet another advantage, the present selection and directed evolution technique can make use of very large and complex libraries of candidate sequences. In contrast, existing protein selection methods which rely on an in vivo step are typically limited to relatively small libraries of somewhat limited complexity. This advantage is particularly important when selecting functional protein sequences considering, for example, that 10¹³ possible sequences exist for a peptide of only 10 amino acids in length. In classical genetic techniques, lac repressor fusion approaches, and phage display methods, maximum complexities generally fall orders of magnitude below 10¹³ members. Large library size also provides an advantage for directed evolution applications, in that sequence space can be explored to a greater depth around any given starting sequence.
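The figure of 10¹³ sequences for a 10-residue peptide follows from 20 possible amino acids at each position (20¹⁰ ≈ 1.02 × 10¹³). The short sketch below reproduces that arithmetic and asks, for a given library size, what peptide length could in principle be represented exhaustively. It assumes an idealized library restricted to the 20 standard amino acids in which every member is distinct; the function names are hypothetical and are used only for illustration.

```python
AMINO_ACIDS = 20  # standard amino acids available at each position


def sequence_space(length: int) -> int:
    """Number of distinct peptides of the given length."""
    return AMINO_ACIDS ** length


def longest_fully_sampled(library_size: int) -> int:
    """Longest peptide length whose full sequence space fits in the library.

    Assumption: an ideal library in which every member is a distinct sequence.
    """
    length = 0
    while sequence_space(length + 1) <= library_size:
        length += 1
    return length


if __name__ == "__main__":
    print(f"20^10 = {sequence_space(10):.3e} possible 10-mers")
    for exponent in (9, 13, 15):
        size = 10 ** exponent
        print(f"library of 10^{exponent}: exhaustive up to "
              f"{longest_fully_sampled(size)} residues")
```

On this idealized count, a pool of 10¹⁵ members could hold every 11-residue peptide, while a pool of 10¹³ members falls just short of covering all 10-residue peptides.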

The present technique also differs from prior approaches in that the selection step is context-independent. In many other selection schemes, the context in which, for example, an expressed protein is present can profoundly influence the nature of the library generated. For example, an expressed protein may not be properly expressed in a particular system or may not be properly displayed (for example, on the surface of a phage particle). Alternatively, the expression of a protein may actually interfere with one or more critical steps in a selection cycle, e.g., phage viability or infectivity, or lac repressor binding. These problems can result in the loss of functional molecules or in limitations on the nature of the selection procedures that may be applied.

Finally, the present method is advantageous because it provides control over the repertoire of proteins that may be tested. In certain techniques (for example, antibody selection), there exists little or no control over the nature of the starting pool. In yet other techniques (for example, lac fusions and phage display), the candidate pool must be expressed in the context of a fusion protein. In contrast, RNA-protein fusion constructs provide control over the nature of the candidate pools available for screening. In addition, the candidate pool size has the potential to be as high as RNA or DNA pools (~10¹⁵ members), limited only by the size of the in vitro translation reaction performed. And the makeup of the candidate pool depends completely on experimental design; random regions may be screened in isolation or within the context of a desired fusion protein, and most if not all possible sequences may be expressed in candidate pools of RNA-protein fusions.

Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

DETAILED DESCRIPTION

The drawings will first briefly be described.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-1C are schematic representations of steps involved in the production of RNA-protein fusions. FIG. 1A illustrates a sample DNA construct for generation of an RNA portion of a fusion. FIG. 1B illustrates the generation of an RNA/puromycin conjugate. And FIG. 1C illustrates the generation of an RNA-protein fusion.

FIG. 2 is a schematic representation of a generalized selection protocol according to the invention.

FIG. 3 is a schematic representation of a synthesis protocol for minimal translation templates containing 3′ puromycin. Step (A) shows the addition of protective groups to the reactive functional groups on puromycin (5′-OH and NH₂); as modified, these groups are suitably protected for use in phosphoramidite based oligonucleotide synthesis. The protected puromycin was attached to aminohexyl controlled pore glass (CPG) through the 2′OH group using the standard protocol for attachment of DNA through its 3′OH (Gait, Oligonucleotide Synthesis, A Practical Approach, The Practical Approach Series (IRL Press, Oxford, 1984)). In step (B), a minimal translation template (termed “43-P”), which contained 43 nucleotides, was synthesized using standard RNA and DNA chemistry (Millipore, Bedford, Mass.), deprotected using NH₄OH and TBAF, and gel purified. The template contained 13 bases of RNA at the 5′ end followed by 29 bases of DNA attached to the 3′ puromycin at its 5′ OH. The RNA sequence contained (i) a Shine-Dalgarno consensus sequence complementary to five bases of 16S rRNA (Stormo et al., Nucleic Acids Research 10:2971-2996 (1982); Shine and Dalgarno, Proc. Natl. Acad. Sci. USA 71:1342-1346 (1974); and Steitz and Jakes, Proc. Natl. Acad. Sci. USA 72:4734-4738 (1975)), (ii) a five base spacer, and (iii) a single AUG start codon. The DNA sequence was dA₂₇dCdCP, where “P” is puromycin.
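The length bookkeeping for template 43-P can be checked with a few lines of arithmetic. The sketch below expands the stated DNA linker dA₂₇dCdC and adds up the parts. It assumes that the 3′ puromycin is counted as one residue toward the total of 43, an inference from 13 + 29 = 42 that also agrees with the designation 30-P used elsewhere in this description for the DNA-puromycin portion. The RNA sequence itself is not reproduced here, so only its stated length is used; all names in the code are hypothetical.

```python
RNA_LENGTH = 13               # bases of RNA at the 5' end, as stated for 43-P
DNA_LINKER = "A" * 27 + "CC"  # dA27 dC dC, written 5' to 3'
PUROMYCIN_RESIDUES = 1        # assumption: puromycin counts as one residue


def template_length(rna_length: int, dna_linker: str,
                    acceptor_residues: int = PUROMYCIN_RESIDUES) -> int:
    """Total residue count of an RNA-DNA-puromycin translation template."""
    return rna_length + len(dna_linker) + acceptor_residues


if __name__ == "__main__":
    print("DNA linker length:", len(DNA_LINKER))
    print("DNA-puromycin portion:", len(DNA_LINKER) + PUROMYCIN_RESIDUES)
    print("Full template:", template_length(RNA_LENGTH, DNA_LINKER))
```

The same helper can be reused for other linker lengths by changing `DNA_LINKER`, which may be convenient when comparing templates that differ only in the DNA portion.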

FIG. 4 is a schematic representation of a preferred method for the preparation of protected CPG-linked puromycin.

FIG. 5 is a schematic representation showing possible modes of methionine incorporation into a template of the invention. As shown in reaction (A), the template binds the ribosome, allowing formation of the 70S initiation complex. Fmet tRNA binds to the P site and is base paired to the template. The puromycin at the 3′ end of the template enters the A site in an intramolecular fashion and forms an amide linkage to N-formyl methionine via the peptidyl transferase center, thereby deacylating the tRNA. Phenol/chloroform extraction of the reaction yields the template with methionine covalently attached. Shown in reaction (B) is an undesired intermolecular reaction of the template with puromycin containing oligonucleotides. As before, the minimal template stimulates formation of the 70S ribosome containing fmet tRNA bound to the P site. This is followed by entry of a second template in trans to give a covalently attached methionine.

FIGS. 6A-6H are photographs showing the incorporation of ³⁵S methionine (³⁵S met) into translation templates. FIG. 6A demonstrates magnesium (Mg²⁺) dependence of the reaction. FIG. 6B demonstrates base stability of the product; the change in mobility shown in this figure corresponds to a loss of the 5′ RNA sequence of 43-P (also termed “Met template”) to produce the DNA-puromycin portion, termed 30-P. The retention of the label following base treatment was consistent with the formation of a peptide bond between ³⁵S methionine and the 3′ puromycin of the template. FIG. 6C demonstrates the inhibition of product formation in the presence of peptidyl transferase inhibitors. FIG. 6D demonstrates the dependence of ³⁵S methionine incorporation on a template coding sequence. FIG. 6E demonstrates DNA template length dependence of ³⁵S methionine incorporation. FIG. 6F illustrates cis versus trans product formation using templates 43-P and 25-P. FIG. 6G illustrates cis versus trans product formation using templates 43-P and 13-P. FIG. 6H illustrates cis versus trans product formation using templates 43-P and 30-P in a reticulocyte lysate system.

FIGS. 7A-7C are schematic illustrations of constructs for testing peptide fusion formation and selection. FIG. 7A shows LP77 (“ligated-product,” “77” nucleotides long) (also termed, “short myc template”) (SEQ ID NO:1). This sequence contains the c-myc monoclonal antibody epitope tag EQKLISEEDL (SEQ ID NO:2) (Evan et al., Mol. Cell Biol. 5:3610-3616 (1985)) flanked by a 5′ start codon and a 3′ linker. The 5′ region contains a bacterial Shine-Dalgarno sequence identical to that of 43-P. The coding sequence was optimized for translation in bacterial systems. In particular, the 5′ UTRs of 43-P and LP77 contained a Shine-Dalgarno sequence complementary to five bases of 16S rRNA (Steitz and Jakes, Proc. Natl. Acad. Sci. USA 72:4734-4738 (1975)) and spaced similarly to ribosomal protein sequences (Stormo et al., Nucleic Acids Res. 10:2971-2996 (1982)). FIG. 7B shows LP154 (ligated product, 154 nucleotides long) (also termed “long myc template”) (SEQ ID NO:3). This sequence contains the code for generation of the peptide used to isolate the c-myc antibody. The 5′ end contains a truncated version of the TMV upstream sequence (designated “TE”). This 5′ UTR contained a 22 nucleotide sequence derived from the TMV 5′ UTR encompassing two ACAAAUUAC direct repeats (Gallie et al., Nucl. Acids Res. 16:883 (1988)). FIG. 7C shows Pool #1 (SEQ ID NO:4), an exemplary sequence to be used for peptide selection. The final seven amino acids from the original myc peptide were included in the template to serve as the 3′ constant region required for PCR amplification of the template. This sequence is known not to be part of the antibody binding epitope.

FIG. 8 is a photograph demonstrating the synthesis of RNA-protein fusions using templates 43-P, LP77, and LP154, and reticulocyte (“Retic”) and wheat germ (“Wheat”) translation systems. The left half of the figure illustrates ³⁵S methionine incorporation in each of the three templates. The right half of the figure illustrates the resulting products after RNase A treatment of each of the three templates to remove the RNA coding region; shown are ³⁵S methionine-labeled DNA-protein fusions. The DNA portion of each was identical to the oligo 30-P. Thus, differences in mobility were proportional to the length of the coding regions, consistent with the existence of proteins of different length in each case.

FIG. 9 is a photograph demonstrating protease sensitivity of an RNA-protein fusion synthesized from LP154 and analyzed by denaturing polyacrylamide gel electrophoresis. Lane 1 contains ³²P labeled 30-P. Lanes 2-4, 5-7, and 8-10 contain the ³⁵S labeled translation templates recovered from reticulocyte lysate reactions either without treatment, with RNase A treatment, or with RNase A and proteinase K treatment, respectively.

FIG. 10 is a photograph showing the results of immunoprecipitation reactions using in vitro translated 33 amino acid myc-epitope protein. Lanes 1 and 2 show the translation products of the myc epitope protein and β-globin templates, respectively. Lanes 3-5 show the results of immunoprecipitation of the myc-epitope peptide using a c-myc monoclonal antibody and PBS, DB, and PBSTDS wash buffers, respectively. Lanes 6-8 show the same immunoprecipitation reactions, but using the β-globin translation product.

FIG. 11 is a photograph demonstrating immunoprecipitation of an RNA-protein fusion from an in vitro translation reaction. The picomoles of template used in the reaction are indicated. Lanes 1-4 show RNA124 (the RNA portion of fusion LP154), and lanes 5-7 show RNA-protein fusion LP154. After immunoprecipitation using a c-myc monoclonal antibody and protein G sepharose, the samples were treated with RNase A and T4 polynucleotide kinase, then loaded on a denaturing urea polyacrylamide gel to visualize the fusion. In lanes 1-4, with samples containing either no template or only the RNA portion of the long myc template (RNA124), no fusion was seen. In lanes 5-7, bands corresponding to the fusion were clearly visualized. The position of ³²P labeled 30-P is indicated, and the amount of input template is indicated at the top of the figure.

FIG. 12 is a graph showing a quantitation of fusion material obtained from an in vitro translation reaction. The intensity of the fusion bands shown in lanes 5-7 of FIG. 11 and the 30-P band (isolated in a parallel fashion on dT₂₅, not shown) were quantitated on phosphorimager plates and plotted as a function of input LP154 concentration. Recovered modified 30-P (left y axis) was linearly proportional to input template (x axis), whereas linker-peptide fusion (right y axis) was constant. From this analysis, it was calculated that ~10¹² fusions were formed per ml of translation reaction sample.
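As a rough unit check on the quantity quoted for FIG. 12, the following sketch converts a particle count per ml into a molar concentration. Only the figure of roughly 10¹² fusions per ml comes from the text; the function name is illustrative and the conversion simply applies Avogadro's number.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_per_ml_to_nanomolar(molecules_per_ml: float) -> float:
    """Convert a particle count per ml into a nanomolar concentration."""
    moles_per_ml = molecules_per_ml / AVOGADRO
    moles_per_liter = moles_per_ml * 1000.0  # 1000 ml per liter
    return moles_per_liter * 1e9             # mol/L -> nmol/L


fusions_per_ml = 1e12  # approximate value quoted in the description of FIG. 12
conc_nM = molecules_per_ml_to_nanomolar(fusions_per_ml)
print(f"{conc_nM:.2f} nM")  # equivalently, the same number of pmol per ml
```

On this arithmetic, 10¹² fusions per ml corresponds to a concentration in the low nanomolar range, or a few picomoles of fusion per ml of translation reaction.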

FIG. 13 is a schematic representation of thiopropyl sepharose and dT₂₅ agarose, and the ability of these substrates to interact with the RNA-protein fusions of the invention.

FIG. 14 is a photograph showing the results of sequential isolation of fusions of the invention. Lane 1 contains ³²P labeled 30-P. Lanes 2 and 3 show LP154 isolated from translation reactions and treated with RNase A. In lane 2, LP154 was isolated sequentially, using thiopropyl sepharose followed by dT₂₅ agarose. Lane 3 shows isolation using only dT₂₅ agarose. The results indicated that the product contained a free thiol, likely the penultimate cysteine in the myc epitope coding sequence.

FIGS. 15A and 15B are photographs showing the formation of fusion products using β-globin templates as assayed by SDS-tricine-PAGE (polyacrylamide gel electrophoresis). FIG. 15A shows incorporation of ³⁵S using either no template (lane 1), a syn-β-globin template (lanes 2-4), or an LP-β-globin template (lanes 5-7). FIG. 15B (lanes labeled as in FIG. 15A) shows ³⁵S-labeled material isolated by oligonucleotide affinity chromatography. No material was isolated in the absence of a 30-P tail (lanes 2-4).

FIGS. 16A-16C are diagrams and photographs illustrating enrichment of myc dsDNA versus pool dsDNA by in vitro selection. FIG. 16A is a schematic of the selection protocol. Four mixtures of the myc and pool templates were translated in vitro and isolated on dT₂₅ agarose followed by TP sepharose to purify the template fusions from unmodified templates. The mRNA-peptide fusions were then reverse transcribed to suppress any secondary or tertiary structure present in the templates. Aliquots of each mixture were removed both before (FIG. 16B) and after (FIG. 16C) affinity selection, amplified by PCR in the presence of a labeled primer, and digested with a restriction enzyme that cleaved only the myc DNA. The input mixtures of templates were pure myc (lane 1), or a 1:20, 1:200, or 1:2000 myc:pool (lanes 2-4). The unselected material deviated from the input ratios due to preferential translation and reverse transcription of the myc template. The enrichment of the myc template during the selective step was calculated from the change in the pool:myc ratio before and after selection.
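The enrichment calculation described for FIGS. 16B and 16C can be sketched as a ratio of ratios. This is a minimal illustration, assuming that band intensities for the pool and myc DNA have already been quantitated; the function names and the numerical intensities are hypothetical and are not data from the figures.

```python
def pool_to_myc_ratio(pool_signal: float, myc_signal: float) -> float:
    """Return the pool:myc ratio from two quantitated band intensities."""
    if myc_signal <= 0:
        raise ValueError("myc signal must be positive to form a ratio")
    return pool_signal / myc_signal


def enrichment(before: tuple, after: tuple) -> float:
    """Fold enrichment of myc = (pool:myc before) / (pool:myc after).

    Each argument is a (pool_signal, myc_signal) pair.
    """
    return pool_to_myc_ratio(*before) / pool_to_myc_ratio(*after)


# Hypothetical intensities: a 20:1 pool:myc mixture before selection
# that becomes 1:1 after selection.
before_selection = (2000.0, 100.0)
after_selection = (100.0, 100.0)
fold = enrichment(before_selection, after_selection)
print(f"{fold:.0f}-fold enrichment of the myc template")
```

Because the ratio is taken on the unselected material rather than on the nominal input mixture, any bias from preferential translation or reverse transcription of the myc template is present in both measurements and does not inflate the result.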

FIG. 17 is a photograph illustrating the translation of myc RNA templates. The following linkers were used: lanes 1-4, dA₂₇dCdCP; lanes 5-8, dA₂₇rCrCP; and lanes 9-12, dA₂₁C₉C₉C₉dAdCdCP. In each lane, the concentration of RNA template was 600 nM, and ³⁵S-Met was used for labeling. Reaction conditions were as follows: lanes 1, 5, and 9, 30° C. for 1 hour; lanes 2, 6, and 10, 30° C. for 2 hours; lanes 3, 7, and 11, 30° C. for 1 hour, −20° C. for 16 hours; and lanes 4, 8, and 12, 30° C. for 1 hour, −20° C. for 16 hours with 50 mM Mg²⁺. In this Figure, “A” represents free peptide, and “B” represents mRNA-peptide fusion.

FIG. 18 is a photograph illustrating the translation of myc RNA templates labeled with ³²P. The linker utilized was dA₂₁C₉C₉C₉dAdCdCP. Translation was performed at 30° C. for 90 minutes, and incubations were carried out at −20° C. for 2 days without additional Mg²⁺. The concentrations of mRNA templates were 400 nM (lane 3), 200 nM (lane 4), 100 nM (lane 5), and 100 nM (lane 6). Lane 1 shows mRNA-peptide fusion labeled with ³⁵S-Met. Lane 2 shows mRNA labeled with ³²P. In lane 6, the reaction was carried out in the presence of 0.5 mM cap analog.

FIG. 19 is a photograph illustrating the translation of myc RNA template using lysate obtained from Ambion (lane 1), Novagen (lane 2), and Amersham (lane 3). The linker utilized was dA₂₇dCdCP. The concentration of the template was 600 nM, and ³⁵S-Met was used for labeling. Translations were performed at 30° C. for 1 hour, and incubations were carried out at −20° C. overnight in the presence of 50 mM Mg²⁺.

Described herein is a general method for the selection of proteins with desired functions using fusions in which these proteins are covalently linked to their own messenger RNAs. These RNA-protein fusions are synthesized by in vitro or in situ translation of mRNA pools containing a peptide acceptor attached to their 3′ ends (FIG. 1B). In one preferred embodiment, after readthrough of the open reading frame of the message, the ribosome pauses when it reaches the designed pause site, and the acceptor moiety occupies the ribosomal A site and accepts the nascent peptide chain from the peptidyl-tRNA in the P site to generate the RNA-protein fusion (FIG. 1C). The covalent link between the protein and the RNA (in the form of an amide bond between the 3′ end of the mRNA and the C-terminus of the protein which it encodes) allows the genetic information in the protein to be recovered and amplified (e.g., by PCR) following selection by reverse transcription of the RNA. Once the fusion is generated, selection or enrichment is carried out based on the properties of the mRNA-protein fusion, or, alternatively, reverse transcription may be carried out using the mRNA template while it is attached to the protein to avoid any effect of the single-stranded RNA on the selection. When the mRNA-protein construct is used, selected fusions may be tested to determine which moiety (the protein, the RNA, or both) provides the desired function.

In one preferred embodiment, puromycin (which resembles tyrosyl adenosine) acts as the acceptor to attach the growing peptide to its mRNA. Puromycin is an antibiotic that acts by terminating peptide elongation. As a mimetic of aminoacyl-tRNA, it acts as a universal inhibitor of protein synthesis by binding the A site, accepting the growing peptide chain, and falling off the ribosome (at a Kd = 10⁻⁴ M) (Traut and Monro, J. Mol. Biol. 10:63 (1964); Smith et al., J. Mol. Biol. 13:617 (1965)). One of the most attractive features of puromycin is the fact that it forms a stable amide bond to the growing peptide chain, thus allowing for more stable fusions than potential acceptors that form unstable ester linkages. In particular, the peptidyl-puromycin molecule contains a stable amide linkage between the peptide and the O-methyl tyrosine portion of the puromycin. The O-methyl tyrosine is in turn linked by a stable amide bond to the 3′-amino group of the modified adenosine portion of puromycin.

Other possible choices for acceptors include tRNA-like structures at the 3′ end of the mRNA, as well as other compounds that act in a manner similar to puromycin. Such compounds include, without limitation, any compound which possesses an amino acid linked to an adenine or an adenine-like compound, such as the amino acid nucleotides, phenylalanyl-adenosine (A-Phe), tyrosyl adenosine (A-Tyr), and alanyl adenosine (A-Ala), as well as amide-linked structures, such as phenylalanyl 3′ deoxy 3′ amino adenosine, alanyl 3′ deoxy 3′ amino adenosine, and tyrosyl 3′ deoxy 3′ amino adenosine; in any of these compounds, any of the naturally-occurring L-amino acids or their analogs may be utilized. In addition, a combined tRNA-like 3′ structure-puromycin conjugate may also be used in the invention.

Shown in FIG. 2 is a preferred selection scheme according to the invention. The steps involved in this selection are generally carried out as follows.

Step 1. Preparation of the DNA template. As a step toward generating the RNA-protein fusions of the invention, the RNA portion of the fusion is synthesized. This may be accomplished by direct chemical RNA synthesis or, more commonly, is accomplished by transcribing an appropriate double-stranded DNA template.

Such DNA templates may be created by any standard technique (including any technique of recombinant DNA technology, chemical synthesis, or both). In principle, any method that allows production of one or more templates containing a known, random, randomized, or mutagenized sequence may be used for this purpose. In one particular approach, an oligonucleotide (for example, containing random bases) is synthesized and is amplified (for example, by PCR) prior to transcription. Chemical synthesis may also be used to produce a random cassette which is then inserted into the middle of a known protein coding sequence (see, for example, chapter 8.2, Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons and Greene Publishing Company, 1994). This latter approach produces a high density of mutations around a specific site of interest in the protein.

An alternative to total randomization of a DNA template sequence is partial randomization, and a pool synthesized in this way is generally referred to as a “doped” pool. An example of this technique, performed on an RNA sequence, is described, for example, by Ekland et al. (Nucl. Acids Research 23:3231 (1995)). Partial randomization may be performed chemically by biasing the synthesis reactions such that each base addition reaction mixture contains an excess of one base and small amounts of each of the others; by careful control of the base concentrations, a desired mutation frequency may be achieved by this approach. Partially randomized pools may also be generated using error prone PCR techniques, for example, as described in Beaudry and Joyce (Science 257:635 (1992)) and Bartel and Szostak (Science 261:1411 (1993)).
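The relationship between the base mixture used at each position and the resulting mutation frequency can be illustrated with a short simulation. This is a sketch under stated assumptions, not a synthesis protocol: it assumes the non-parental fraction is split equally among the other three bases and that every base couples with equal efficiency, neither of which is specified in the text. All names and the example parent sequence are hypothetical.

```python
import random

BASES = "ACGT"


def doped_mix(parent_base: str, mutation_rate: float) -> dict:
    """Base fractions for one position: an excess of the parental base and
    equal small amounts of the other three (equal split is an assumption)."""
    other = mutation_rate / 3.0
    return {b: (1.0 - mutation_rate if b == parent_base else other) for b in BASES}


def synthesize_doped(parent: str, mutation_rate: float, rng: random.Random) -> str:
    """Draw one member of a doped pool built around a parental sequence."""
    out = []
    for base in parent:
        mix = doped_mix(base, mutation_rate)
        out.append(rng.choices(BASES, weights=[mix[b] for b in BASES])[0])
    return "".join(out)


def fraction_unmutated(length: int, mutation_rate: float) -> float:
    """Expected fraction of pool members identical to the parent."""
    return (1.0 - mutation_rate) ** length


rng = random.Random(0)
parent = "ATGGAACAGAAACTG"  # hypothetical parental sequence
pool = [synthesize_doped(parent, 0.1, rng) for _ in range(5)]
print(pool)
print(f"expected unmutated fraction: {fraction_unmutated(len(parent), 0.1):.3f}")
```

The expected number of mutations per molecule is the per-position mutation rate multiplied by the sequence length, so the same doping level produces far fewer parental copies in a long template than in a short one.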

Numerous methods are also available for generating a DNA construct beginning with a known sequence and then creating a mutagenized DNA pool. Examples of such techniques are described in Ausubel et al. (supra, chapter 8) and Sambrook et al. (Molecular Cloning: A Laboratory Manual, chapter 15, Cold Spring Harbor Press, New York, 2nd ed. (1989)). Random sequences may also be generated by the “shuffling” technique outlined in Stemmer (Nature 370:389 (1994)).

To optimize a selection scheme of the invention, the sequences and structures at the 5′ and 3′ ends of a template may also be altered. Preferably, this is carried out in two separate selections, each involving the insertion of random domains into the template proximal to the appropriate end, followed by selection. These selections may serve (i) to maximize the amount of fusion made (and thus to maximize the complexity of a library) or (ii) to provide optimized translation sequences. Further, the method may be generally applicable, combined with mutagenic PCR, to the optimization of translation templates both in the coding and non-coding regions.

Step 2. Generation of RNA. As noted above, the RNA portion of an RNA-protein fusion may be chemically synthesized using standard techniques of oligonucleotide synthesis. Alternatively, and particularly if longer RNA sequences are utilized, the RNA portion is generated by in vitro transcription of a DNA template. In one preferred approach, T7 polymerase is used to enzymatically generate the RNA strand. Other appropriate RNA polymerases for this use include, without limitation, the SP6, T3 and E. coli RNA polymerases (described, for example, in Ausubel et al. (supra, chapter 3)). In addition, the synthesized RNA may be, in whole or in part, modified RNA. In one particular example, phosphorothioate RNA may be produced (for example, by T7 transcription) using modified ribonucleotides and standard techniques. Such modified RNA provides the advantage of being nuclease stable.

Step 3. Ligation of Puromycin to the Template. Next, puromycin (or any other appropriate peptide acceptor) is covalently bonded to the template sequence. This step may be accomplished using T4 RNA ligase to attach the puromycin directly to the RNA sequence, or preferably the puromycin may be attached by way of a DNA “splint” using T4 DNA ligase or any other enzyme which is capable of joining together two nucleotide sequences (see FIG. 1B) (see also, for example, Ausubel et al., supra, chapter 3, sections 14 and 15). tRNA synthetases may also be used to attach puromycin-like compounds to RNA. For example, phenylalanyl tRNA synthetase links phenylalanine to phenylalanyl-tRNA molecules containing a 3′ amino group, generating RNA molecules with puromycin-like 3′ ends (Fraser and Rich, Proc. Natl. Acad. Sci. USA 70:2671 (1973)). Other peptide acceptors which may be used include, without limitation, any compound which possesses an amino acid linked to an adenine or an adenine-like compound, such as the amino acid nucleotides, phenylalanyl-adenosine (A-Phe), tyrosyl adenosine (A-Tyr), and alanyl adenosine (A-Ala), as well as amide-linked structures, such as phenylalanyl 3′ deoxy 3′ amino adenosine, alanyl 3′ deoxy 3′ amino adenosine, and tyrosyl 3′ deoxy 3′ amino adenosine; in any of these compounds, any of the naturally-occurring L-amino acids or their analogs may be utilized. A number of peptide acceptors are described, for example, in Krayevsky and Kukhanova, Progress in Nucleic Acids Research and Molecular Biology 23:1 (1979).

Step 4. Generation and Recovery of RNA-Protein Fusions. To generate RNA-protein fusions, any in vitro or in situ translation system may be utilized. As shown below, eukaryotic systems are preferred, and two particularly preferred systems include the wheat germ and reticulocyte lysate systems. In principle, however, any translation system which allows formation of an RNA-protein fusion and which does not significantly degrade the RNA portion of the fusion is useful in the invention. In addition, to reduce RNA degradation in any of these systems, degradation-blocking antisense oligonucleotides may be included in the translation reaction mixture; such oligonucleotides specifically hybridize to and cover sequences within the RNA portion of the molecule that trigger degradation (see, for example, Hanes and Pluckthun, Proc. Natl. Acad. Sci. USA 94:4937 (1997)).

As noted above, any number of eukaryotic translation systems are available for use in the invention. These include, without limitation, lysates from yeast, ascites, tumor cells (Leibowitz et al., Meth. Enzymol. 194:536 (1991)), and Xenopus oocyte eggs. Useful in vitro translation systems from bacterial systems include, without limitation, those described in Zubay (Ann. Rev. Genet. 7:267 (1973)); Chen and Zubay (Meth. Enzymol. 101:44 (1983)); and Ellman (Meth. Enzymol. 202:301 (1991)).

In addition, translation reactions may be carried out in situ. In one particular example, translation may be carried out by injecting mRNA into Xenopus eggs using standard techniques.

Once generated, RNA-protein fusions may be recovered from the translation reaction mixture by any standard technique of protein or RNA purification. Typically, protein purification techniques are utilized. As shown below, for example, purification of a fusion may be facilitated by the use of suitable chromatographic reagents such as dT₂₅ agarose or thiopropyl sepharose. Purification, however, may also or alternatively involve purification based upon the RNA portion of the fusion; techniques for such purification are described, for example, in Ausubel et al. (supra, chapter 4).

Step 5. Selection of the Desired RNA-Protein Fusion. Selection of a desired RNA-protein fusion may be accomplished by any means available to selectively partition or isolate a desired fusion from a population of candidate fusions. Examples of isolation techniques include, without limitation, selective binding, for example, to a binding partner which is directly or indirectly immobilized on a column, bead, membrane, or other solid support, and immunoprecipitation using an antibody specific for the protein moiety of the fusion. The first of these techniques makes use of an immobilized selection motif which can consist of any type of molecule to which binding is possible. A list of possible selection motif molecules is presented in FIG. 2. Selection may also be based upon the use of substrate molecules attached to an affinity label (for example, substrate-biotin) which react with a candidate molecule, or upon any other type of interaction with a fusion molecule. In addition, proteins may be selected based upon their catalytic activity in a manner analogous to that described by Bartel and Szostak for the isolation of RNA enzymes (supra); according to that particular technique, desired molecules are selected based upon their ability to link a target molecule to themselves, and the functional molecules are then isolated based upon the presence of that target. Selection schemes for isolating novel or improved catalytic proteins using this same approach or any other functional selection are enabled by the present invention.

In addition, as described herein, selection of a desired RNA-protein fusion (or its DNA copy) may be facilitated by enrichment for that fusion in a pool of candidate molecules. To carry out such an optional enrichment, a population of candidate RNA-protein fusions is contacted with a binding partner (for example, one of the binding partners described above) which is specific for either the RNA portion or the protein portion of the fusion, under conditions which substantially separate the binding partner-fusion complex from unbound members in the sample. This step may be repeated, and the technique preferably includes at least two sequential enrichment steps, one in which the fusions are selected using a binding partner specific for the RNA portion and another in which the fusions are selected using a binding partner specific for the protein portion. In addition, if enrichment steps targeting the same portion of the fusion (for example, the protein portion) are repeated, different binding partners are preferably utilized. In one particular example described herein, a population of molecules is enriched for desired fusions by first using a binding partner specific for the RNA portion of the fusion and then, in two sequential steps, using two different binding partners, both of which are specific for the protein portion of the fusion. Again, these complexes may be separated from sample components by any standard separation technique including, without limitation, column affinity chromatography, centrifugation, or immunoprecipitation.

Moreover, elution of an RNA-protein fusion from an enrichment (or selection) complex may be accomplished by a number of approaches. For example, as described herein, one may utilize a denaturing or non-specific chemical elution step to isolate a desired RNA-protein fusion. Such a step facilitates the release of complex components from each other or from an associated solid support in a relatively non-specific manner by breaking non-covalent bonds between the components and/or between the components and the solid support. As described herein, one exemplary denaturing or non-specific chemical elution reagent is 4% HOAc/H₂O. Other exemplary denaturing or non-specific chemical elution reagents include guanidine, urea, high salt, detergent, or any other means by which non-covalent adducts may generally be removed. Alternatively, one may utilize a specific chemical elution approach, in which a chemical is exploited that causes the specific release of a fusion molecule. In one particular example, if the linker arm of a desired fusion protein contains one or more disulfide bonds, bound fusion aptamers may be eluted by the addition, for example, of DTT, resulting in the reduction of the disulfide bond and release of the bound target.

Alternatively, elution may be accomplished by specifically disrupting affinity complexes; such techniques selectively release complex components by the addition of an excess of one member of the complex. For example, in an ATP-binding selection, elution is performed by the addition of excess ATP to the incubation mixture. Finally, one may carry out a step of enzymatic elution. By this approach, a bound molecule itself or an exogenously added protease (or other appropriate hydrolytic enzyme) cleaves and releases either the target or the enzyme. In one particular example, a protease target site may be included in either of the complex components, and the bound molecules eluted by addition of the protease. Alternately, in a catalytic selection, elution may be used as a selection step for isolating molecules capable of releasing (for example, cleaving) themselves from a solid support.

Step 6. Generation of a DNA Copy of the RNA Sequence using Reverse Transcriptase. If desired, a DNA copy of a selected RNA fusion sequence is readily available by reverse transcribing that RNA sequence using any standard technique (for example, using Superscript reverse transcriptase). This step may be carried out prior to the selection or enrichment step (for example, as described in FIG. 16), or following that step. Alternatively, the reverse transcription process may be carried out prior to the isolation of the fusion from the in vitro or in situ translation mixture.

Next, the DNA template is amplified, either as a partial or full-length double-stranded sequence. Preferably, in this step, full-length DNA templates are generated, using appropriate oligonucleotides and PCR amplification.

These steps, and the reagents and techniques for carrying out these steps, are now described in detail using particular examples. These examples are provided for the purpose of illustrating the invention, and should not be construed as limiting.

GENERATION OF TEMPLATES FOR RNA-PROTEIN FUSIONS

As shown in FIGS. 1A and 2, the selection scheme of the present invention preferably makes use of double-stranded DNA templates which include a number of design elements. The first of these elements is a promoter to be used in conjunction with a desired RNA polymerase for mRNA synthesis. As shown in FIG. 1A and described herein, the T7 promoter is preferred, although any promoter capable of directing synthesis from a linear double-stranded DNA may be used.

The second element of the template shown in FIG. 1A is termed the 5′ untranslated region (or 5′UTR) and corresponds to the RNA upstream of the translation start site. Shown in FIG. 1A is a preferred 5′UTR (termed “TE”) which is a deletion mutant of the Tobacco Mosaic Virus 5′ untranslated region and, in particular, corresponds to the bases directly 5′ of the TMV translation start; the sequence of this UTR is as follows: rGrGrG rArCrA rArUrU rArCrU rArUrU rUrArC rArArU rUrArC rA (with the first 3 G nucleotides being inserted to augment transcription) (SEQ ID NO: 5). Any other appropriate 5′ UTR may be utilized (see, for example, Kozak, Microbiol. Rev. 47:1 (1983)).

The third element shown in FIG. 1A is the translation start site. In general, this is an AUG codon. However, there are examples where codons other than AUG are utilized in naturally-occurring coding sequences, and these codons may also be used in the selection scheme of the invention.

The fourth element in FIG. 1A is the open reading frame of the protein (termed ORF), which encodes the protein sequence. This open reading frame may encode any naturally-occurring, random, randomized, mutagenized, or totally synthetic protein sequence.

The fifth element shown in FIG. 1A is the 3′ constant region. This sequence facilitates PCR amplification of the pool sequences and ligation of the puromycin-containing oligonucleotide to the mRNA. If desired, this region may also include a pause site, a sequence which causes the ribosome to pause and thereby allows additional time for an acceptor moiety (for example, puromycin) to accept a nascent peptide chain from the peptidyl-tRNA; this pause site is discussed in more detail below.
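How the transcribed elements fit together can be illustrated by joining a 5′UTR to an open reading frame and decoding from the first AUG. The sketch below is illustrative only. The 5′UTR string is transcribed from the “TE” sequence given above with every position read as a ribonucleotide; the open reading frame is a hypothetical set of codons chosen to encode the myc epitope and is not SEQ ID NO:1 or SEQ ID NO:3; the promoter and the 3′ constant region are omitted. The standard genetic code is assumed.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered with U, C, A, G at each position.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

TE_5_UTR = "GGGACAAUUACUAUUUACAAUUACA"     # "TE" 5'UTR from the text
MYC_ORF = "AUGGAACAGAAACUGAUCUCUGAAGAAGACCUG"  # hypothetical codon choice


def translate_from_first_aug(mrna: str) -> str:
    """Decode codons from the first AUG until a stop codon or the 3' end."""
    start = mrna.find("AUG")
    if start < 0:
        return ""
    peptide = []
    for pos in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[pos:pos + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


mrna = TE_5_UTR + MYC_ORF
print(translate_from_first_aug(mrna))
```

In this example the 5′UTR contains no AUG of its own, so the first AUG encountered is the intended start codon, and the decoded peptide is methionine followed by the EQKLISEEDL epitope.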

To develop the present methodology, RNA-protein fusions were initially generated using highly simplified mRNA templates containing 1-2 codons. This approach was taken for two reasons. First, templates of this size could readily be made by chemical synthesis. And, second, a small open reading frame allowed critical features of the reaction, including efficiency of linkage, end heterogeneity, template dependence, and accuracy of translation, to be readily assayed.

Design of Construct. A basic construct was used for generating test RNA-protein fusions. The molecule consisted of an mRNA containing (i) a Shine-Dalgarno (SD) sequence for translation initiation which contained a 3 base deletion of the SD sequence from ribosomal protein L1 and which was complementary to 5 bases of 16S rRNA (i.e., rGrGrA rGrGrA rCrGrA rA) (SEQ ID NO: 6) (Stormo et al., Nucleic Acids Research 10:2971-2996 (1982); Shine and Dalgarno, Proc. Natl. Acad. Sci. USA 71:1342-1346 (1974); and Steitz and Jakes, Proc. Natl. Acad. Sci. USA 72:4734-4738 (1975)), (ii) an AUG start codon, (iii) a DNA linker to act as a pause site (i.e., 5′-(dA)₂₇), (iv) dCdC-3′, and (v) a 3′ puromycin (P). The poly dA sequence was chosen because it was known to template tRNA poorly in the A site (Morgan et al., J. Mol. Biol. 26:477-497 (1967); Ricker and Kaji, Nucleic Acids Research 19:6573-6578 (1991)) and was designed to act as a good pause site. The length of the oligo dA linker was chosen to span the ~60-70 Å distance between the decoding site and the peptidyl transfer center of the ribosome. The dCdCP mimicked the CCA end of a tRNA and was designed to facilitate binding of the puromycin to the A site of the ribosome.
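The composition of the construct can be tallied element by element. The following sketch is a simple bookkeeping aid; it assumes that the Shine-Dalgarno sequence and the AUG codon abut with no intervening bases and that puromycin is counted as a single residue. Under those assumptions the totals are 30 for the dA₂₇dCdCP tail and 43 for the full construct, which would be consistent with the names 30-P and 43-P, although the text does not state how those names were derived.

```python
# Elements of the minimal construct, 5' to 3', as listed in the text.
SHINE_DALGARNO = "GGAGGACGAA"  # rGrGrA rGrGrA rCrGrA rA (SEQ ID NO: 6)
START_CODON = "AUG"
LINKER_DA_UNITS = 27           # (dA)27 pause-site linker
DC_UNITS = 2                   # dCdC, mimicking the CCA end of a tRNA
PUROMYCIN_UNITS = 1            # assumption: puromycin counted as one residue

rna_residues = len(SHINE_DALGARNO) + len(START_CODON)
tail_residues = LINKER_DA_UNITS + DC_UNITS + PUROMYCIN_UNITS
total_residues = rna_residues + tail_residues

print(f"RNA portion:        {rna_residues} residues")
print(f"dA27-dCdC-P tail:   {tail_residues} residues")
print(f"complete construct: {total_residues} residues")
```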

Chemical Synthesis of Minimal Template 43-P. To synthesize construct 43-P (shown in FIG. 3), puromycin was first attached to a solid support in such a way that it would be compatible with standard phosphoramidite oligonucleotide synthesis chemistry. The synthesis protocol for this oligo is outlined schematically in FIG. 3 and is described in more detail below. To attach puromycin to a controlled pore glass (CPG) solid support, the amino group was protected with a trifluoroacetyl group as described in Applied Biosystems User Bulletin #49 for DNA synthesizer model 380 (1988). Next, protection of the 5′ OH was carried out using a standard DMT-Cl approach (Gait, Oligonucleotide Synthesis, A Practical Approach, The Practical Approach Series (IRL Press, Oxford, 1984)), and attachment to aminohexyl CPG through the 2′ OH was effected in exactly the same fashion as the 3′ OH would be used for attachment of a deoxynucleoside (see FIG. 3 and Gait, supra, p. 47). The 5′ DMT-CPG-linked protected puromycin was then suitable for chain extension with phosphoramidite monomers. The synthesis of the oligo proceeded in the 3′→5′ direction in the order: (i) 3′ puromycin, (ii) pdCpdC, (iii) ~27 units of dA as a linker, (iv) AUG, and (v) the Shine-Dalgarno sequence. The sequence of the 43-P construct is shown below.

Synthesis of CPG Puromycin. The synthesis of protected CPG puromycin followed the general path used for deoxynucleosides as previously outlined (Gait, Oligonucleotide Synthesis, A Practical Approach, The Practical Approach Series (IRL Press, Oxford, 1984)). Major departures included the selection of an appropriate N blocking group, attachment at the 2′ OH to the solid support, and the linkage reaction to the solid support. In the case of the latter, the reaction was carried out at very low concentrations of activated nucleotide as this material was significantly more precious than the solid support. The resulting yield (~20 μmol/g support) was quite satisfactory considering the dilute reaction conditions.

Synthesis of N-Trifluoroacetyl Puromycin. 267 mg (0.490 mmol) Puromycin*HCl was first converted to the free base form by dissolving in water, adding pH 11 carbonate buffer, and extracting (3X) into chloroform. The organic phase was evaporated to dryness and weighed (242 mg, 0.513 mmol). The free base was then dissolved in 11 ml dry pyridine and 11 ml dry acetonitrile, and 139 μl (2.0 mmol) triethylamine (TEA) and 139 μl (1.0 mmol) of trifluoroacetic anhydride (TFAA) were added with stirring. TFAA was then added to the turbid solution in 20 μl aliquots until none of the starting material remained, as assayed by thin layer chromatography (tlc) (93:7, Chloroform/MeOH) (a total of 280 μl). The reaction was allowed to proceed for one hour. At this point, two bands were revealed by thin layer chromatography, both of higher mobility than the starting material. Workup of the reaction with NH₄OH and water reduced the product to a single band. Silica chromatography (93:7 Chloroform/MeOH) yielded 293 mg (0.515 mmol) of the product, N-TFA-Pur. The product of this reaction is shown schematically in FIG. 4.

Synthesis of N-Trifluoroacetyl 5′-DMT Puromycin. The product from the above reaction was aliquoted and coevaporated 2X with dry pyridine to remove water. Multiple tubes were prepared to test multiple reaction conditions. In a small scale reaction, 27.4 mg (48.2 μmoles) N-TFA-Pur were dissolved in 480 μl of pyridine containing 0.05 eq of DMAP and 1.4 eq TEA. To this mixture, 20.6 mg of trityl chloride (60 μmol) was added, and the reaction was allowed to proceed to completion with stirring. The reaction was stopped by addition of an equal volume of water (approximately 500 μl) to the solution. Because this reaction appeared successful, a large scale version was performed. In particular, 262 mg (0.467 mmol) N-TFA-Pur was dissolved in 2.4 ml pyridine followed by addition of 1.4 eq of TEA, 0.05 eq of DMAP, and 1.2 eq of trityl chloride. After approximately two hours, an additional 50 mg (0.3 eq) dimethoxytrityl*Cl (DMT*Cl) was added, and the reaction was allowed to proceed for 20 additional minutes. The reaction was stopped by the addition of 3 ml of water and coevaporated 3X with CH₃CN. The reaction was purified by 95:5 Chloroform/MeOH on a 100 ml silica (dry) 2 mm diameter column. Due to incomplete purification, a second identical column was run with 97.5:2.5 Chloroform/MeOH. The total yield was 325 mg or 0.373 mmol (or a yield of 72%). The product of this reaction is shown schematically in FIG. 4.

Synthesis of N-Trifluoroacetyl, 5′-DMT, 2′ Succinyl Puromycin. In a small scale reaction, 32 mg (37 μmol) of the product synthesized above was combined with 1.2 eq of DMAP dissolved in 350 μl of pyridine. To this solution, 1.2 equivalents of succinic anhydride was added in 44 μl of dry CH₃CN and allowed to stir overnight. Thin layer chromatography revealed little of the starting material remaining. In a large scale reaction, 292 mg (336 μmol) of the previous product was combined with 1.2 eq DMAP in 3 ml of pyridine. To this, 403 μl of 1M succinic anhydride in dry CH₃CN was added, and the mixture was allowed to stir overnight. Thin layer chromatography again revealed little of the starting material remaining. The two reactions were combined, and an additional 0.2 eq of DMAP and succinate were added. The product was coevaporated with toluene 1X and dried to a yellow foam in high vacuum. CH₂Cl₂ was added (20 ml), and this solution was extracted twice with 15 ml of 10% ice cold citric acid and then twice with pure water. The product was dried, redissolved in 2 ml of CH₂Cl₂, and precipitated by addition of 50 ml of hexane with stirring. The product was then vortexed and centrifuged at 600 rpm for 10 minutes in the clinical centrifuge. The majority of the eluent was drawn off, and the rest of the product was dried, first at low vacuum, then at high vacuum in a desiccator. The yield of this reaction was approximately 260 μmol for a stepwise yield of 70%.

Synthesis of N-Trifluoroacetyl, 5′-DMT, 2′ Succinyl, CPG Puromycin. The product from the previous step was next dissolved with 1 ml of dioxane followed by 0.2 ml dioxane/0.2 ml pyridine. To this solution, 40 mg of p-nitrophenol and 140 mg of dicyclohexylcarbodiimide (DCC) were added, and the reaction was allowed to proceed for 2 hours. The insoluble cyclohexyl urea produced by the reaction was removed by centrifugation, and the product solution was added to 5 g of aminohexyl controlled pore glass (CPG) suspended in 22 ml of dry DMF and stirred overnight. The resin was then washed with DMF, methanol, and ether, and dried. The resulting resin was assayed as containing 22.6 μmol of trityl per g, well within the acceptable range for this type of support. The support was then capped by incubation with 15 ml of pyridine, 1 ml of acetic anhydride, and 60 mg of DMAP for 30 minutes. The resulting column material produced a negative (no color) ninhydrin test, in contrast to the results obtained before blocking in which the material produced a dark blue color reaction. The product of this reaction is shown schematically in FIG. 4.

Synthesis of mRNA-Puromycin Conjugate. As discussed above, a puromycin tethered oligo may be used in either of two ways to generate an mRNA-puromycin conjugate which acts as a translation template. For extremely short open reading frames, the puromycin oligo is typically extended chemically with RNA or DNA monomers to create a totally synthetic template. When longer open reading frames are desired, the RNA or DNA oligo is generally ligated to the 3′ end of an mRNA using a DNA splint and T4 DNA ligase as described by Moore and Sharp (Science 256:992 (1992)).

IN VITRO TRANSLATION AND TESTING OF RNA-PROTEIN FUSIONS

The templates generated above were translated in vitro using both bacterial and eukaryotic in vitro translation systems as follows.

In Vitro Translation of Minimal Templates. 43-P and related RNA-puromycin conjugates were added to several different in vitro translation systems including: (i) the S30 system derived from E. coli MRE600 (Zubay, Ann. Rev. Genet. 7:267 (1973); Collins, Gene 6:29 (1979); Chen and Zubay, Methods Enzymol. 101:44 (1983); Pratt, in Transcription and Translation: A Practical Approach, B. D. Hames, S. J. Higgins, Eds. (IRL Press, Oxford, 1984) pp. 179-209; and Ellman et al., Methods Enzymol. 202:301 (1991)) prepared as described by Ellman et al. (Methods Enzymol. 202:301 (1991)); (ii) the ribosomal fraction derived from the same strain, prepared as described by Kudlicki et al. (Anal. Biochem. 206:389 (1992)); and (iii) the S30 system derived from E. coli BL21, prepared as described by Lesley et al. (J. Biol. Chem. 266:2632 (1991)). In each case, the premix used was that of Lesley et al. (J. Biol. Chem. 266:2632 (1991)), and the incubations were 30 minutes in duration.

Testing the Nature of the Fusion. The 43-P template was first tested using S30 translation extracts from E. coli. FIG. 5 (Reaction “A”) demonstrates the desired intramolecular (cis) reaction wherein 43-P binds the ribosome and acts as a template for and an acceptor of fMet at the same time. The incorporation of ³⁵S-methionine and its position in the template was first tested, and the results are shown in FIGS. 6A and 6B. After extraction of the in vitro translation reaction mixture with phenol/chloroform and analysis of the products by SDS-PAGE, an ³⁵S labeled band appeared with the same mobility as the 43-P template. The amount of this material synthesized was dependent upon the Mg²⁺ concentration (FIG. 6A). The optimum Mg²⁺ concentration appeared to be between 9 and 18 mM, which was similar to the optimum for translation in this system (Zubay, Ann. Rev. Genet. 7:267 (1973); Collins, Gene 6:29 (1979); Chen and Zubay, Methods Enzymol. 101:44 (1983); Pratt, in Transcription and Translation: A Practical Approach, B. D. Hames, S. J. Higgins, Eds. (IRL Press, Oxford, 1984) pp. 179-209; Ellman et al., Methods Enzymol. 202:301 (1991); Kudlicki et al., Anal. Biochem. 206:389 (1992); and Lesley et al., J. Biol. Chem. 266:2632 (1991)). Furthermore, the incorporated label was stable to treatment with NH₄OH (FIG. 6B), indicating that the label was located on the 3′ half of the molecule (the base-stable DNA portion) and was attached by a base-stable linkage, as expected for an amide bond between puromycin and fMet.

Ribosome and Template Dependence. To demonstrate that the reaction observed above occurred on the ribosome, the effects of specific inhibitors of the peptidyl transferase function of the ribosome were tested (FIG. 6C), and the effect of changing the sequence coding for methionine was examined (FIG. 6D). FIG. 6C demonstrates clearly that the reaction was strongly inhibited by the peptidyl transferase inhibitors, virginiamycin, gougerotin, and chloramphenicol (Monro and Vazquez, J. Mol. Biol. 28:161-165 (1967); and Vazquez and Monro, Biochimica et Biophysica Acta 142:155-173 (1967)). FIG. 6D demonstrates that changing a single base in the template from A to C abolished incorporation of ³⁵S methionine at 9 mM Mg²⁺, and greatly decreased it at 18 mM (consistent with the fact that high levels of Mg²⁺ allow misreading of the message). These experiments demonstrated that the reaction occurred on the ribosome in a template dependent fashion.

Linker Length. Also tested was the dependence of the reaction on the length of the linker (FIG. 6E). The original template was designed so that the linker spanned the distance from the decoding site (occupied by the AUG of the template) to the acceptor site (occupied by the puromycin moiety), a distance which was approximately the same length as the distance between the anticodon loop and the acceptor stem in a tRNA, or about 60-70 Å. The first linker tested was 30 nucleotides in length, based upon a minimum of 3.4 Å per base (>102 Å). In the range between 30 and 21 nucleotides (n = 27-18; length >102-71 Å), little change was seen in the efficiency of the reaction. Accordingly, linker length may be varied. While a linker of between 21 and 30 nucleotides represents a preferred length, linkers shorter than 80 nucleotides and, preferably, shorter than 45 nucleotides may also be utilized in the invention.
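The length estimate quoted above is simple arithmetic, and a minimal sketch of it follows. The 3.4 Å per base figure and the 60-70 Å target distance are taken from the text; the function names are hypothetical and serve only as illustration.

```python
# Minimum rise per nucleotide assumed in the text (3.4 Å per base).
ANGSTROM_PER_BASE = 3.4


def min_linker_span(n_nucleotides, rise=ANGSTROM_PER_BASE):
    """Return the minimum end-to-end span, in Å, of a linker of n nucleotides."""
    return n_nucleotides * rise


def spans_distance(n_nucleotides, required_angstrom):
    """True if the minimum span of the linker reaches the required distance."""
    return min_linker_span(n_nucleotides) >= required_angstrom


# Linker lengths in the tested range, compared against the upper end (70 Å)
# of the decoding-site-to-acceptor-site distance quoted in the text.
for n in (30, 27, 24, 21):
    print(n, "nt:", round(min_linker_span(n), 1), "Å, reaches 70 Å:",
          spans_distance(n, 70.0))
```

Under this estimate, even the shortest linker tested (21 nucleotides) exceeds the 60-70 Å distance, which agrees with the observation that efficiency changed little across the range.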

Intramolecular vs. Intermolecular Reactions. Finally, we tested whether the reaction occurred in an intramolecular fashion (FIG. 5, Reaction “A”) as desired or intermolecularly (FIG. 5, Reaction “B”). This was tested by adding oligonucleotides with 3′ puromycin but no ribosome binding sequence (i.e., templates 25-P, 13-P, and 30-P) to the translation reactions containing the 43-P template (FIGS. 6F, 6G, and 6H). If the reaction occurred by an intermolecular mechanism, the shorter oligos would also be labeled. As demonstrated in FIGS. 6F-H, there was little incorporation of ³⁵S methionine in the three shorter oligos, indicating that the reaction occurred primarily in an intramolecular fashion. The sequences of 25-P (SEQ ID NO:10), 13-P (SEQ ID NO:9), and 30-P (SEQ ID NO:8) are shown below.

Reticulocyte Lysate. FIG. 6H demonstrates that ³⁵S-methionine may be incorporated in the 43-P template using a rabbit reticulocyte lysate (see below) for in vitro translation, in addition to the E. coli lysates used above. This reaction occurred primarily by an intramolecular mechanism, as desired.

SYNTHESIS AND TESTING OF FUSIONS CONTAINING A C-MYC EPITOPE TAG

Exemplary fusions were also generated which contained, within the protein portion, the epitope tag for the c-myc monoclonal antibody 9E10 (Evan et al., Mol. Cell Biol. 5:3610 (1985)).

Design of Templates. Three initial epitope tag templates (i.e., LP77, LP154, and Pool #1) were designed and are shown in FIGS. 7A-C. The first two templates contained the c-myc epitope tag sequence EQKLISEEDL (SEQ ID NO:2), and the third template was the design used in the synthesis of a random selection pool. LP77 encoded a 12 amino acid sequence, with the codons optimized for bacterial translation. LP154 and its derivatives contained an mRNA sequence encoding 33 amino acids, in which the codons were optimized for eukaryotic translation. The encoded amino acid sequence of MAEEQKLISEEDLLRKRREQKLKHKLEQLRNSCA (SEQ ID NO:7) corresponded to the original peptide used to isolate the 9E10 antibody. Pool #1 contained 27 codons of NNG/C (to generate random peptides) followed by a sequence corresponding to the last seven amino acids of the myc peptide (which were not part of the myc epitope sequence). These sequences are shown below.

Reticulocyte vs. Wheat Germ In Vitro Translation Systems. The 43-P, LP77, and LP154 templates were tested in both rabbit reticulocyte and wheat germ extract (Promega, Boehringer Mannheim) translation systems (FIG. 8). Translations were performed at 30° C. for 60 minutes. Templates were isolated using dT₂₅ agarose at 4° C. Templates were eluted from the agarose using 15 mM NaOH, 1 mM EDTA, neutralized with NaOAc/HOAc buffer, immediately ethanol precipitated (2.5-3 vol), washed (with 100% ethanol), and dried on a speedvac concentrator. FIG. 8 shows that ³⁵S methionine was incorporated into all three templates, in both the wheat germ and reticulocyte systems. Less degradation of the template was observed in the fusion reactions from the reticulocyte system and, accordingly, this system is preferred for the generation of RNA-protein fusions. In addition, in general, eukaryotic systems are preferred over bacterial systems. Because eukaryotic cells tend to contain lower levels of nucleases, mRNA lifetimes are generally 10-100 times longer in these cells than in bacterial cells. In experiments using one particular E. coli translation system, generation of fusions was not observed using a template encoding the c-myc epitope; labeling the template in various places demonstrated that this was likely due to degradation of both the RNA and DNA portions of the template.

To examine the peptide portion of these fusions, samples were treated with RNase to remove the coding sequences. Following this treatment, the 43-P product ran with almost identical mobility to the ³²P labeled 30-P oligo, consistent with a very small peptide (perhaps only methionine) added to 30-P. For LP77, removal of the coding sequence produced a product with lower mobility than the 30-P oligo, consistent with the notion that a 12 amino acid peptide was added to the puromycin. Finally, for LP154, removal of the coding sequence produced a product of yet lower mobility, consistent with a 33 amino acid sequence attached to the 30-P oligo. No oligo was seen in the RNase-treated LP154 reticulocyte lane due to a loading error. In FIG. 9, the mobility of this product was shown to be the same as the product generated in the wheat germ extract. In sum, these results indicated that RNase resistant products were added to the ends of the 30-P oligos, that the sizes of the products were proportional to the length of the coding sequences, and that the products were quite homogeneous in size. In addition, although both systems produced similar fusion products, the reticulocyte system appeared superior due to higher template stability.

Sensitivity to RNase A and Proteinase K. In FIG. 9, sensitivity to RNase A and proteinase K was tested using the LP154 fusion. As shown in lanes 2-4, incorporation of ³⁵S methionine was demonstrated for the LP154 template. When this product was treated with RNase A, the mobility of the fusion decreased, but was still significantly higher than the ³²P labeled 30-P oligonucleotide, consistent with the addition of a 33 amino acid peptide to the 3′ end. When this material was also treated with proteinase K, the ³⁵S signal completely disappeared, again consistent with the notion that the label was present in a peptide at the 3′ end of the 30-P fragment. Similar results have been obtained in equivalent experiments using the 43-P and LP77 fusions.

To confirm that the template labeling by ³⁵S Met was a consequence of translation, and more specifically resulted from the peptidyl transferase activity of the ribosome, the effect of various inhibitors on the labeling reaction was examined. The specific inhibitors of eukaryotic peptidyl transferase, anisomycin, gougerotin, and sparsomycin (Vazquez, Inhibitors of Protein Biosynthesis (Springer-Verlag, New York), pp. 312 (1979)), as well as the translocation inhibitors cycloheximide and emetine (Vazquez, Inhibitors of Protein Biosynthesis (Springer-Verlag, New York), pp. 312 (1979)) all decreased RNA-peptide fusion formation by ~95% using the long myc template and a reticulocyte lysate translation extract.

Immunoprecipitation Experiments. In an experiment designed to illustrate the efficacy of immunoprecipitating an mRNA-peptide fusion, an attempt was made to immunoprecipitate a free c-myc peptide generated by in vitro translation. FIG. 10 shows the results of these experiments assayed on an SDS PAGE peptide gel. Lanes 1 and 2 show the labeled material from translation reactions containing either RNA124 (the RNA portion of LP154) or β-globin mRNA. Lanes 3-8 show the immunoprecipitation of these reaction samples using the c-myc monoclonal antibody 9E10, under several different buffer conditions (described below). Lanes 3-5 show that the peptide derived from RNA124 was effectively immunoprecipitated, with the best case being lane 4 where ~83% of the total TCA precipitable counts were isolated. Lanes 6-8 show little of the β-globin protein, indicating a purification of >100 fold. These results indicated that the peptide coded for by RNA124 (and by LP154) can be quantitatively isolated by this immunoprecipitation protocol.

Immunoprecipitation of the Fusion. We next tested the ability to immunoprecipitate a chimeric RNA-peptide product, using an LP154 translation reaction and the c-myc monoclonal antibody 9E10 (FIG. 11). The translation products from a reticulocyte reaction were isolated by immunoprecipitation (as described herein) and treated with 1 μg of RNase A at room temperature for 30 minutes to remove the coding sequence. This generated a 5′OH, which was ³²P labeled with T4 polynucleotide kinase and assayed by denaturing PAGE. FIG. 11 demonstrates that a product with a mobility similar to that seen for the fusion of the c-myc epitope with 30-P generated by RNase treatment of the LP154 fusion (see above) was isolated, but no corresponding product was made when only the RNA portion of the template (RNA124) was translated. In FIG. 12, the quantity of fusion protein isolated was determined and was plotted against the amount of unmodified 30-P (not shown in this figure). Quantitation of the ratio of unmodified linker to linker-myc peptide fusion shows that 0.2-0.7% of the input message was converted to fusion product. A higher fraction of the input RNA was converted to fusion product in the presence of a higher ribosome/template ratio; over the range of input mRNA concentrations that were tested, approximately 0.8-1.0×10¹² fusion molecules were made per ml of translation extract.

In addition, our results indicated that the peptides attached to the RNA species were encoded by that mRNA, i.e., the nascent peptide was not transferred to the puromycin of some other mRNA. No indication of cross-transfer was seen when a linker (30-P) was coincubated with the long myc template in translation extracts in ratios as high as 20:1, nor did the presence of free linker significantly decrease the amount of long myc fusion produced. Similarly, co-translation of the short and long templates, 43-P and LP154, produced only the fusion products seen when the templates were translated alone, and no products of intermediate mobility were observed, as would be expected for fusion of the short template with the long myc peptide. Both of these results suggested that fusion formation occurred primarily between a nascent peptide and mRNA bound to the same ribosome.

Sequential Isolation. As a further confirmation of the nature of the in vitro translated LP154 template product, we examined the behavior of this product on two different types of chromatography media. Thiopropyl (TP) sepharose allows the isolation of a product containing a free cysteine (for example, the LP154 product which has a cysteine residue adjacent to the C terminus) (FIG. 13). Similarly, dT₂₅ agarose allows the isolation of templates containing a poly dA sequence (for example, 30-P) (FIG. 13). FIG. 14 demonstrates that sequential isolation on TP sepharose followed by dT₂₅ agarose produced the same product as isolation on dT₂₅ agarose alone. The fact that the in vitro translation product contained both a poly-A tract and a free thiol strongly indicated that the translation product was the desired RNA-peptide fusion.

The above results are consistent with the ability to synthesize mRNA-peptide fusions and to recover them intact from in vitro translation extracts. The peptide portions of fusions so synthesized appeared to have the intended sequences as demonstrated by immunoprecipitation and isolation using appropriate chromatographic techniques. According to the results presented above, the reactions are intramolecular and occur in a template dependent fashion. Finally, even with a template modification of less than 1%, the present system facilitates selections based on candidate complexities of about 10¹³ molecules.
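The relationship between template input, conversion efficiency, and candidate complexity can be illustrated with a short calculation. The sketch below is only an order-of-magnitude estimate: the 0.2-0.7% conversion range is the figure reported above, while pairing it with a 2.5 nanomole template input (the amount of conjugate reported in the methods) is an assumption made here for illustration, and the function names are hypothetical.

```python
AVOGADRO = 6.022e23  # molecules per mole


def molecules_from_nanomoles(nmol):
    """Convert an amount in nanomoles to a number of molecules."""
    return nmol * 1e-9 * AVOGADRO


def expected_fusions(template_molecules, conversion_fraction):
    """Estimate fusion molecules formed from a given template input."""
    return template_molecules * conversion_fraction


# Assumed input: 2.5 nmol of RNA-puromycin template.
templates = molecules_from_nanomoles(2.5)
# Conversion range reported in the text: 0.2% to 0.7% of input message.
low = expected_fusions(templates, 0.002)
high = expected_fusions(templates, 0.007)
print(f"{templates:.2e} templates -> {low:.1e} to {high:.1e} fusions")
```

Under these assumptions the estimate falls in the range of roughly 10¹² to 10¹³ fusion molecules, the same order of magnitude as the candidate complexity discussed above.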

C-Myc Epitope Recovery Selection. To select additional c-myc epitopes, a large library of translation templates (for example, 10¹⁵ members) is generated containing a randomized region (see FIG. 7C and below). This library is used to generate ~10¹²-10¹³ fusions (as described herein) which are treated with the anti-c-myc antibody (for example, by immunoprecipitation or using an antibody immobilized on a column or other solid support) to enrich for c-myc-encoding templates in repeated rounds of in vitro selection.

Models for Fusion Formation. Without being bound to a particular theory, we propose a model for the mechanism of fusion formation in which translation initiates normally and elongation proceeds to the end of the open reading frame. When the ribosome reaches the DNA portion of the template, translation stalls. At this point, the complex can partition between two fates: dissociation of the nascent peptide, or transfer of the nascent peptide to the puromycin at the 3′-end of the template. The efficiency of the transfer reaction is likely to be controlled by a number of factors that influence the stability of the stalled translation complex and the entry of the 3′-puromycin residue into the A site of the peptidyl transferase center. After the transfer reaction, the mRNA-peptide fusion likely remains complexed with the ribosome since the known release factors cannot hydrolyze the stable amide linkage between the RNA and peptide domains.

Both the classical model for elongation (Watson, Bull. Soc. Chim. Biol. 46:1399 (1964)) and the intermediate states model (Moazed and Noller, Nature 342:142 (1989)) require that the A site be empty for puromycin entry into the peptidyl transferase center. For the puromycin to enter the empty A site, the linker must either loop around the outside of the ribosome or pass directly from the decoding site through the A site to the peptidyl transferase center. The data described herein do not clearly distinguish between these alternatives because the shortest linker tested (21 nts) is still long enough to pass around the outside of the ribosome. In some models of ribosome structure (Frank et al., Nature 376:441 (1995)), the mRNA is threaded through a channel that extends on either side of the decoding site, in which case unthreading of the linker from the channel would be required to allow the puromycin to reach the peptidyl transferase center through the A site.

Transfer of the nascent peptide to the puromycin appeared to be slow relative to the elongation process as demonstrated by the homogeneity and length of the peptide attached to the linker. If the puromycin competed effectively with aminoacyl tRNAs during elongation, the linker-peptide fusions present in the fusion products would be expected to be heterogeneous in size. Furthermore, the ribosome did not appear to read into the linker region as indicated by the similarity in gel mobilities between the Met-template fusion and the unmodified linker. dA₃ₙ should code for (lysine)ₙ, which would certainly decrease the mobility of the linker. The slow rate of unthreading of the mRNA may explain the slow rate of fusion formation relative to the rate of translocation. Preliminary results suggest that the amount of fusion product formed increases markedly following extended post-translation incubation at low temperature, perhaps because of the increased time available for transfer of the nascent peptide to the puromycin.

DETAILED MATERIALS AND METHODS

Described below are detailed materials and methods relating to the in vitro translation and testing of RNA-protein fusions, including fusions having a myc epitope tag.

Sequences. A number of oligonucleotides were used above for the generation of RNA-protein fusions. These oligonucleotides have the following sequences.

NAME  SEQUENCE
30-P  5′AAA AAA AAA AAA AAA AAA AAA AAA AAA CCP (SEQ ID NO:8)
13-P  5′AAA AAA AAA ACC P (SEQ ID NO:9)
25-P  5′CGC GGT TTT TAT TTT TTT TTT TCC P (SEQ ID NO:10)
43-P  5′rGrGrA rGrGrA rCrGrA rArArU rGAA AAA AAA AAA AAA AAA AAA AAA AAA ACC P (SEQ ID NO:11)
43-P[CUG]  5′rGrGrA rGrGrA rCrGrA rArCrU rGAA AAA AAA AAA AAA AAA AAA AAA AAA ACC P (SEQ ID NO:12)
40-P  5′rGrGrA rGrGrA rCrGrA rArCrU rGAA AAA AAA AAA AAA AAA AAA AAA ACC P (SEQ ID NO:13)
37-P  5′rGrGrA rGrGrA rCrGrA rArCrU rGAA AAA AAA AAA AAA AAA AAA ACC P (SEQ ID NO:14)
34-P  5′rGrGrA rGrGrA rCrGrA rArCrU rGAA AAA AAA AAA AAA AAA ACC P (SEQ ID NO:15)
31-P  5′rGrGrA rGrGrA rCrGrA rArCrU rGAA AAA AAA AAA AAA ACC P (SEQ ID NO:16)
LP77  5′rGrGrG rArGrG rArCrG rArArA rUrGrG rArArC rArGrA rArArC rUrGrA rUrCrU rCrUrG rArArG rArArG rArCrC rUrGrA rArC AAA AAA AAA AAA AAA AAA AAA AAA AAA CCP (SEQ ID NO:1)
LP154  5′rCrGrG rArCrA rArUrU rArCrU rArUrU rUrArC rArArU rUrArC rA rArUrG rGrCrU rGrArA rGrArA rCrArG rArArA rCrUrG rArUrC rUrCrU rGrArA rCrArA rGrArC rCrUrG rCrUrG rCrGrU rArArA rCrGrU rCrGrU rCrArA rCrArG rCrUrG rArArA rCrArC rArArA rCrUrG rGrArA rCrArG rCrUrG rCrGrU rArArC rUrCrU rUrGrC rGrCrU AAA AAA AAA AAA AAA AAA AAA AAA AAA CCP (SEQ ID NO:3)
LP160  5′rGrGrG rArCrA rArUrU rArCrU rArUrU rUrArC rArArU rUrArC rA rArUrG rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rNrNrS rCrArG rCrUrG rCrGrU rArArC rUrCrU rUrGrC rGrCrU AAA AAA AAA AAA AAA AAA AAA AAA AAA CCP (SEQ ID

All oligonucleotides are listed in the 5′ to 3′ direction. Ribonucleotide bases are indicated by lower case “r” prior to the nucleotide designation; P is puromycin; rN indicates equal amounts of rA, rG, rC, and rU; rS indicates equal amounts of rG and rC; and all other base designations indicate DNA oligonucleotides.

Chemicals. Puromycin HCl, long chain alkylamine controlled pore glass, gougerotin, chloramphenicol, virginiamycin, DMAP, dimethoxytrityl chloride, and acetic anhydride were obtained from Sigma Chemical (St. Louis, Mo.). Pyridine, dimethylformamide, toluene, succinic anhydride, and para-nitrophenol were obtained from Fluka Chemical (Ronkonkoma, N.Y.). Beta-globin mRNA was obtained from Novagen (Madison, Wis.). TMV RNA was obtained from Boehringer Mannheim (Indianapolis, Ind.).

Enzymes. Proteinase K was obtained from Promega (Madison, Wis.). DNase-free RNase was either produced by the protocol of Sambrook et al. (supra) or purchased from Boehringer Mannheim. T7 polymerase was made by the published protocol of Grodberg and Dunn (J. Bacteriol. 170:1245 (1988)) with the modifications of Zawadzki and Gross (Nucl. Acids Res. 19:1948 (1991)). T4 DNA ligase was obtained from New England Biolabs (Beverly, Mass.).

Quantitation of Radiolabel Incorporation. For radioactive gel bands, the amount of radiolabel (³⁵S or ³²P) present in each band was determined by quantitation either on a Betagen 603 blot analyzer (Betagen, Waltham, Mass.) or using phosphorimager plates (Molecular Dynamics, Sunnyvale, Calif.). For liquid and solid samples, the amount of radiolabel (³⁵S or ³²P) present was determined by scintillation counting (Beckman, Columbia, Md.).

Gel Images. Images of gels were obtained by autoradiography (using Kodak XAR film) or using phosphorimager plates (Molecular Dynamics).

Synthesis of CPG Puromycin. Detailed protocols for synthesis of CPG-puromycin are outlined above.

Enzymatic Reactions. In general, the preparation of nucleic acids for kinase, transcription, PCR, and translation reactions using E. coli extracts was the same. Each preparative protocol began with extraction using an equal volume of 1:1 phenol/chloroform, followed by centrifugation and isolation of the aqueous phase. Sodium acetate (pH 5.2) and spermidine were added to a final concentration of 300 mM and 1 mM respectively, and the sample was precipitated by addition of 3 volumes of 100% ethanol and incubation at −70° C. for 20 minutes. Samples were centrifuged at >12,000 g, the supernatant was removed, and the pellets were washed with an excess of 95% ethanol, at 0° C. The resulting pellets were then dried under vacuum and resuspended.

Oligonucleotides. All synthetic DNA and RNA was synthesized on a Millipore Expedite synthesizer using standard chemistry for each as supplied by the manufacturer (Milligen, Bedford, Mass.). Oligonucleotides containing 3′ puromycin were synthesized using CPG puromycin columns packed with 30-50 mg of solid support (~20 μmole puromycin/gram). Oligonucleotides containing a 3′ biotin were synthesized using 1 μmole bioteg CPG columns from Glen Research (Sterling, Va.). Oligonucleotides containing a 5′ biotin were synthesized by addition of bioteg phosphoramidite (Glen Research) as the 5′ base. Oligonucleotides to be ligated to the 3′ ends of RNA molecules were either chemically phosphorylated at the 5′ end (using chemical phosphorylation reagent from Glen Research) prior to deprotection or enzymatically phosphorylated using ATP and T4 polynucleotide kinase (New England Biolabs) after deprotection. Samples containing only DNA (and 3′ puromycin or 3′ biotin) were deprotected by addition of 25% NH₄OH followed by incubation for 12 hours at 55° C. Samples containing RNA monomers (e.g., 43-P) were deprotected by addition of ethanol (25% (v/v)) to the NH₄OH solution and incubation for 12 hours at 55° C. The 2′OH was deprotected using 1M TBAF in THF (Sigma) for 48 hours at room temperature. TBAF was removed using a NAP-25 Sephadex column (Pharmacia, Piscataway, N.J.).

Deprotected DNA and RNA samples were then purified using denaturing PAGE, followed by either soaking or electro-eluting from the gel using an Elutrap (Schleicher and Schuell, Keene, N.H.) and desalting using either a NAP-25 Sephadex column or ethanol precipitation as described above.

Myc DNA construction. Two DNA templates containing the c-myc epitope tag were constructed. The first template was made from a combination of the oligonucleotides 64.27 (5′-GTT CAG GTC TTC TTG AGA GAT CAG TTT CTG TTC CAT TTC GTC CTC CCT ATA GTG AGT CGT ATT A-3′) (SEQ ID NO:18) and 18.109 (5′-TAA TAC GAC TCA CTA TAG-3′) (SEQ ID NO:19). Transcription using this template produced RNA 47.1 which coded for the peptide MEQKLISEEDLN (SEQ ID NO:20). Ligation of RNA 47.1 to 30-P yielded LP77 shown in FIG. 7A.

The second template was made first as a single oligonucleotide 99 bases in length, having the designation RWR 99.6 and the sequence 5′AGC GCA AGA GTT ACG CAG CTG TTC CAG TTT GTG TTT CAG CTG TTC ACG ACG TTT ACG CAG CAG GTC TTC TTC AGA GAT CAG TTT CTG TTC TTC AGC CAT-3′ (SEQ ID NO:21). Double stranded transcription templates containing this sequence were constructed by PCR with the oligos RWR 21.103 (5′-AGC GCA AGA GTT ACG CAG CTG-3′) (SEQ ID NO:22) and RWR 63.26 (5′TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA TTT ACA ATT ACA ATG GCT GAA GAA CAG AAA CTG-3′) (SEQ ID NO:23) according to published protocols (Ausubel et al., supra, chapter 15). Transcription using this template produced an RNA referred to as RNA124 which coded for the peptide MAEEQKLISEEDLLRKRREQLKHKLEQLRNSCA (SEQ ID NO:24). This peptide contained the sequence used to raise monoclonal antibody 9E10 when conjugated to a carrier protein (Oncogene Science Technical Bulletin). RNA124 was 124 nucleotides in length, and ligation of RNA124 to 30-P produced LP154 shown in FIG. 7B. The sequence of RNA124 is as follows (SEQ ID NO:32): 5′-rGrGrG rArCrA rArUrU rArCrU rArUrU rUrArC rArArU rUrArC rArArUrG rGrCrU rGrArA rGrArA rCrArG rArArA rCrUrG rArUrC rUrCrU rGrArA rGrArA rGrArC rCrUrG rCrUrG rCrGrU rArArA rCrGrU rCrGrU rGrArA rCrArG rCrUrG rArArA rCrArC rArArA rCrUrG rGrArA rCrArG rCrUrG rCrGrU rArArC rUrCrU rUrGrC rGrCrU-3′
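The correspondence between the RNA124 sequence and its encoded peptide can be checked computationally. The following is a minimal sketch, assuming the standard genetic code and translation from the first AUG; the sequence is transcribed from SEQ ID NO:32 as listed above with the "r" prefixes removed, and the helper name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (third position varying fastest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# RNA124 as listed in the text, spaces kept for readability and removed below.
RNA124 = (
    "GGG ACA AUU ACU AUU UAC AAU UAC A "
    "AUG GCU GAA GAA CAG AAA CUG AUC UCU GAA GAA GAC CUG CUG CGU AAA "
    "CGU CGU GAA CAG CUG AAA CAC AAA CUG GAA CAG CUG CGU AAC UCU UGC GCU"
).replace(" ", "")


def translate_from_first_aug(rna):
    """Translate from the first AUG to a stop codon or the end of the RNA."""
    start = rna.find("AUG")
    if start < 0:
        return ""
    peptide = []
    for i in range(start, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


peptide = translate_from_first_aug(RNA124)
print(len(RNA124), "nt ->", peptide, f"({len(peptide)} aa)")
```

Because the template carries no stop codon, translation in this sketch simply runs to the 3′ end of the RNA, mirroring the design in which the ribosome reaches the DNA linker at the end of the open reading frame.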

Randomized Pool Construction. The randomized pool was constructed as a single oligonucleotide 130 bases in length denoted RWR130.1. Beginning at the 3′ end, the sequence was 3′ CCCTGTTAATGATAAATGTTAATGTTAC (NNS)₂₇ GTC GAC GCA TTG AGA TAC CGA-5′ (SEQ ID NO:25). N denotes a random position, and this sequence was generated according to the standard synthesizer protocol. S denotes an equal mix of dG and dC bases. PCR was performed with the oligonucleotides 42.108 (5′-TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA TTT ACA ATT ACA) (SEQ ID NO:26) and 21.103 (5′-AGC GCA AGA GTT ACG CAG CTG) (SEQ ID NO:27). Transcription off this template produced an RNA denoted pool 130.1. Ligation of pool 130.1 to 30-P yielded Pool #1 (also referred to as LP160) shown in FIG. 7C.
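The properties of an NNS randomized region can be illustrated by enumerating the codons it allows. The sketch below assumes the standard genetic code and equal, independent incorporation of bases at each position (an idealization of the synthesis); the variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (third position varying fastest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNS: any base at positions 1 and 2, G or C at position 3.
nns_codons = [a + b + c for a in BASES for b in BASES for c in "GC"]
stop_codons = [c for c in nns_codons if CODON_TABLE[c] == "*"]
encoded = {CODON_TABLE[c] for c in nns_codons} - {"*"}

# Fraction of 27-codon random regions expected to contain no stop codon,
# assuming each NNS codon is drawn uniformly and independently.
N_RANDOM_CODONS = 27
p_open = (1 - len(stop_codons) / len(nns_codons)) ** N_RANDOM_CODONS

print(len(nns_codons), "NNS codons;", len(encoded), "amino acids; stops:",
      stop_codons)
print(f"fraction of 27-codon regions without a stop: {p_open:.2f}")
```

Restricting the third position to G or C removes two of the three stop codons while still covering all twenty amino acids, which is one common rationale for this kind of design.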

Seven cycles of PCR were performed according to published protocols (Ausubel et al., supra) with the following exceptions: (i) the starting concentration of RWR130.1 was 30 nanomolar, (ii) each primer was used at a concentration of 1.5 μM, (iii) the dNTP concentration was 400 μM for each base, and (iv) the Taq polymerase (Boehringer Mannheim) was used at 5 units per 100 μl. The double stranded product was purified on non-denaturing PAGE and isolated by electroelution. The amount of DNA was determined both by UV absorbance at 260 nm and ethidium bromide fluorescence comparison with known standards.

Enzymatic Synthesis of RNA. Transcription reactions from double stranded PCR DNA and synthetic oligonucleotides were performed as described previously (Milligan and Uhlenbeck, Meth. Enzymol. 180:51 (1989)). Full length RNA was purified by denaturing PAGE, electroeluted, and desalted as described above. The pool RNA concentration was estimated using an extinction coefficient of 1300 O.D./μmole; RNA124, 1250 O.D./μmole; RNA 47.1, 480 O.D./μmole. Transcription from the double stranded pool DNA produced ~90 nanomoles of pool RNA.
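The conversion from absorbance to molar amount implied by these extinction coefficients can be sketched as follows. The coefficients are those given above; the interpretation of one O.D. unit as A₂₆₀ × volume in ml (1 cm path length) and the function name are assumptions made for illustration.

```python
# Extinction coefficients from the text, in O.D. units per micromole.
EXTINCTION_OD_PER_UMOL = {
    "pool RNA": 1300.0,
    "RNA124": 1250.0,
    "RNA 47.1": 480.0,
}


def nanomoles_from_a260(a260, volume_ml, species):
    """Estimate the amount of RNA (nmol) from an A260 reading.

    Assumes a 1 cm path length, so that O.D. units = A260 x volume (ml).
    """
    od_units = a260 * volume_ml
    micromoles = od_units / EXTINCTION_OD_PER_UMOL[species]
    return micromoles * 1000.0


# Example: 117 O.D. units of pool RNA (e.g., A260 = 117 in 1 ml equivalent).
print(round(nanomoles_from_a260(117.0, 1.0, "pool RNA"), 1), "nmol")
```

Under these assumptions, the ~90 nanomoles of pool RNA reported above would correspond to roughly 117 O.D. units.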

Enzymatic Synthesis of RNA-Puromycin Conjugates. Ligation of the myc and pool messenger RNA sequences to the puromycin containing oligonucleotide was performed using a DNA splint, termed 19.35 (5′-TTT TTT TTT TAG CGC AAG A) (SEQ ID NO:28) using a procedure analogous to that described by Moore and Sharp (Science 256:992 (1992)). The reaction consisted of mRNA, splint, and puromycin oligonucleotide (30-P, dA27dCdCP) in a mole ratio of 0.8:0.9:1.0 and 1-2.5 units of DNA ligase per picomole of pool mRNA. Reactions were conducted for one hour at room temperature. For the construction of the pool RNA fusions, the mRNA concentration was 6.6 μmolar. Following ligation, the RNA-puromycin conjugate was prepared as described above for enzymatic reactions. The precipitate was resuspended, and full length fusions were purified on denaturing PAGE and isolated by electroelution as described above. The pool RNA concentration was estimated using an extinction coefficient of 1650 O.D./μmole and the myc template 1600 O.D./μmole. In this way, 2.5 nanomoles of conjugate were generated.

Preparation of dT₂₅ Streptavidin Agarose. dT₂₅ containing a 3′ biotin (synthesized on bioteg phosphoramidite columns (Glen Research)) was incubated at 1-10 μM with a slurry of streptavidin agarose (50% agarose by volume, Pierce, Rockford, Ill.) for 1 hour at room temperature in TE (10 mM Tris Chloride pH 8.2, 1 mM EDTA) and washed. The binding capacity of the agarose was then estimated optically by the disappearance of biotin-dT₂₅ from solution and/or by titration of the resin with known amounts of complementary oligonucleotide.

Translation Reactions using E. coli Derived Extracts and Ribosomes. In general, translation reactions were performed with purchased kits (for example, E. coli S30 Extract for Linear Templates, Promega, Madison, Wis.). However, E. coli MRE600 (obtained from the ATCC, Rockville, Md.) was also used to generate S30 extracts prepared according to published protocols (for example, Ellman et al., Meth. Enzymol. 202:301 (1991)), as well as a ribosomal fraction prepared as described by Kudlicki et al. (Anal. Biochem. 206:389 (1992)). The standard reaction was performed in a 50 μl volume with 20-40 μCi of ³⁵S methionine as a marker. The reaction mixture consisted of 30% extract v/v, 9-18 mM MgCl₂, 40% premix minus methionine (Promega) v/v, and 5 μM of template (e.g., 43-P). For coincubation experiments, the oligos 13-P and 25-P were added at a concentration of 5 μM. For experiments using ribosomes, 3 μl of ribosome solution was added per reaction in place of the lysate. All reactions were incubated at 37° C. for 30 minutes. Templates were purified as described above under enzymatic reactions.

Wheat Germ Translation Reactions. The translation reactions in FIG. 8 were performed using purchased kits lacking methionine (Promega), according to the manufacturer's recommendations. Template concentrations were 4 μM for 43-P and 0.8 μM for LP77 and LP154. Reactions were performed at 25° C. with 30 μCi ³⁵S methionine in a total volume of 25 μl.

Reticulocyte Translation Reactions. Translation reactions were performed either with purchased kits (Novagen, Madison, Wis.) or using extract prepared according to published protocols (Jackson and Hunt, Meth. Enzymol. 96:50 (1983)). Reticulocyte-rich blood was obtained from Pel-Freez Biologicals (Rogers, Ark.). In both cases, the reaction conditions were those recommended for use with Red Nova Lysate (Novagen). Reactions consisted of 100 mM KCl, 0.5 mM Mg(OAc)₂, 2 mM DTT, 20 mM HEPES pH 7.6, 8 mM creatine phosphate, 25 μM in each amino acid (with the exception of methionine if ³⁵S Met was used), and 40% v/v of lysate. Incubation was at 30° C. for 1 hour. Template concentrations depended on the experiment but generally ranged from 50 nM to 1 μM, with the exception of 43-P (FIG. 6H), which was 4 μM.

For generation of the randomized pool, 10 ml of translation reaction was performed at a template concentration of ~0.1 μM (1.25 nanomoles of template). In addition, ³²P labeled template was included in the reaction to allow determination of the amount of material present at each step of the purification and selection procedure. After translation at 30° C. for one hour, the reaction was cooled on ice for 30-60 minutes.

Isolation of Fusion with dT₂₅ Streptavidin Agarose. After incubation, the translation reaction was diluted approximately 150 fold into isolation buffer (1.0 M NaCl, 0.1 M Tris chloride pH 8.2, 10 mM EDTA, 1 mM DTT) containing greater than a 10X molar excess of dT₂₅-biotin-streptavidin agarose whose dT₂₅ concentration was ~10 μM (volume of slurry equal to or greater than the volume of lysate) and incubated with agitation at 4° C. for one hour. The agarose was then removed from the mixture either by filtration (Millipore ultrafree MC filters) or centrifugation and washed with cold isolation buffer 2-4 times. The template was then liberated from the dT₂₅ streptavidin agarose by repeated washing with 50-100 μl aliquots of 15 mM NaOH, 1 mM EDTA. The eluate was immediately neutralized in 3 M NaOAc pH 5.2, 10 mM spermidine, and was ethanol precipitated. For the pool reaction, the total radioactivity recovered indicated that approximately 50-70% of the input template was recovered.

Isolation of Fusion with Thiopropyl Sepharose. Fusions containing cysteine can be purified using thiopropyl sepharose 6B as in FIG. 13 (Pharmacia). In the experiments described herein, isolation was either carried out directly from the translation reaction or following initial isolation of the fusion (e.g., with streptavidin agarose). For samples purified directly, a ratio of 1:10 (v/v) lysate to sepharose was used. For the pool, 0.5 ml of sepharose slurry was used to isolate all of the fusion material from 5 ml of reaction mixture. Samples were diluted into a 50:50 (v/v) slurry of thiopropyl sepharose in 1X TE 8.2 (10 mM Tris-Cl, 1 mM EDTA, pH 8.2) containing DNase free RNase (Boehringer Mannheim) and incubated with rotation for 1-2 hours at 4° C. to allow complete reaction. The excess liquid was removed, and the sepharose was washed repeatedly with isolation buffer containing 20 mM DTT and recovered by centrifugation or filtration. The fusions were eluted from the sepharose using a solution of 25-30 mM dithiothreitol (DTT) in 10 mM Tris chloride pH 8.2, 1 mM EDTA. The fusion was then concentrated by a combination of evaporation under high vacuum and ethanol precipitation as described above. For the pool reaction, the total radioactivity recovered indicated that approximately 1% of the template was converted to fusion.

For certain applications, dT₂₅ was added to this eluate and rotated for 1 hour at 4° C. The agarose was rinsed three times with cold isolation buffer, isolated via filtration, and the bound material eluted as above. Carrier tRNA was added, and the fusion product was ethanol precipitated. The sample was resuspended in TE pH 8.2 containing DNase free RNase A to remove the RNA portion of the template.

Immunoprecipitation Reactions. Immunoprecipitations of peptides from translation reactions (FIG. 10) were performed by mixing 4 μl of reticulocyte translation reaction, 2 μl normal mouse sera, and 20 μl Protein G+A agarose (Calbiochem, La Jolla, Calif.) with 200 μl of either PBS (58 mM Na₂HPO₄, 17 mM NaH₂PO₄, 68 mM NaCl), dilution buffer (10 mM Tris chloride pH 8.2, 140 mM NaCl, 1% v/v Triton X-100), or PBSTDS (PBS + 1% Triton X-100, 0.5% deoxycholate, 0.1% SDS). Samples were then rotated for one hour at 4° C., followed by centrifugation at 2500 rpm for 15 minutes. The supernatant was removed, and 10 μl of c-myc monoclonal antibody 9E10 (Calbiochem, La Jolla, Calif.) and 15 μl of Protein G+A agarose were added and rotated for 2 hours at 4° C. Samples were then washed with two 1 ml volumes of either PBS, dilution buffer, or PBSTDS. 40 μl of gel loading buffer (Calbiochem Product Bulletin) was added to the mixture, and 20 μl was loaded on a denaturing PAGE as described by Schagger and von Jagow (Anal. Biochem. 166:368 (1987)).

Immunoprecipitations of fusions (as shown in FIG. 11) were performed by mixing 8 μl of reticulocyte translation reaction with 300 μl of dilution buffer (10 mM Tris chloride pH 8.2, 140 mM NaCl, 1% v/v Triton X-100), 15 μl protein G sepharose (Sigma), and 10 μl (1 μg) c-myc antibody 9E10 (Calbiochem), followed by rotation for several hours at 4° C. After isolation, samples were washed, treated with DNase free RNase A, labeled with polynucleotide kinase and γ-³²P ATP, and separated by denaturing urea PAGE (FIG. 11).

Reverse Transcription of Fusion Pool. Reverse transcription reactions were performed according to the manufacturer's recommendations for Superscript II (Gibco BRL, Grand Island, N.Y.), except that the template, water, and primer were incubated at 70° C. for only two minutes. To monitor extension, 50 μCi α-³²P dCTP was included in some reactions; in other reactions, reverse transcription was monitored using 5′ ³²P-labeled primers which were prepared using γ-³²P ATP (New England Nuclear, Boston, Mass.) and T4 polynucleotide kinase (New England Biolabs, Beverly, Mass.).

Preparation of Protein G and Antibody Sepharose. Two aliquots of 50 μl Protein G sepharose slurry (50% solid by volume) (Sigma) were washed with dilution buffer (10 mM Tris chloride pH 8.2, 140 mM NaCl, 0.025% NaN₃, 1% v/v Triton X-100) and isolated by centrifugation. The first aliquot was reserved for use as a precolumn prior to the selection matrix. After resuspension of the second aliquot in dilution buffer, 40 μg of c-myc AB-1 monoclonal antibody (Oncogene Science) was added, and the reaction incubated overnight at 4° C. with rotation. The antibody sepharose was then purified by centrifugation for 15 minutes at 1500-2500 rpm in a microcentrifuge and washed 1-2 times with dilution buffer.

Selection. After isolation of the fusion and complementary strand synthesis, the entire reverse transcriptase reaction was used directly in the selection process. Two protocols are outlined here. For round one, the reverse transcriptase reaction was added directly to the antibody sepharose prepared as described above and incubated for 2 hours. For subsequent rounds, the reaction was incubated for ~2 hours with washed protein G sepharose prior to the antibody column to decrease the number of binders that interact with protein G rather than the immobilized antibody.

To elute the pool from the matrix, several approaches may be taken. The first is washing the selection matrix with 4% acetic acid. This procedure liberates the peptide from the matrix. Alternatively, a more stringent washing (e.g., using urea or another denaturant) may be used instead of or in addition to the acetic acid approach.

PCR of Selected Fusions. Selected molecules are amplified by PCR using standard protocols as described above for construction of the pool.

SYNTHESIS AND TESTING OF BETA-GLOBIN FUSIONS

To synthesize a β-globin fusion construct, β-globin cDNA was generated from 2.5 μg globin mRNA by reverse transcription with 200 pmoles of primer 18.155 (5′ GTG GTA TTT GTG AGC CAG) (SEQ ID NO: 29) and Superscript reverse transcriptase (Gibco BRL) according to the manufacturer's protocol. The primer sequence was complementary to the 18 nucleotides of β-globin 5′ of the stop codon. To add a T7 promoter, 20 μl of the reverse transcription reaction was removed and subjected to 6 cycles of PCR with primers 18.155 and 40.54 (5′ TAA TAC GAC TCA CTA TAG GGA CAC TTG CTT TTG ACA CAA C) (SEQ ID NO:30). The resulting “syn-β-globin” mRNA was then generated by T7 runoff transcription according to Milligan and Uhlenbeck (Methods Enzymol. 180:51 (1989)), and the RNA gel purified, electroeluted, and desalted as described herein. “LP-β-globin” was then generated from the syn-β-globin construct by ligation of that construct to 30-P according to the method of Moore and Sharp (Science 256:992 (1992)) using primer 20.262 (5′ TTT TTT TTT T GTG GTA TTT G) (SEQ ID NO:31) as the splint. The product of the ligation reaction was then gel purified, electroeluted, and desalted as above. The concentration of the final product was determined by absorbance at 260 nm.

These β-globin templates were then translated in vitro as described in Table 1 in a total volume of 25 μl each. Mg²⁺ was added from a 25 mM stock solution. All reactions were incubated at 30° C. for one hour and placed at −20° C. overnight. dT₂₅-precipitable CPM were then determined in duplicate using 6 μl of lysate, averaged, and corrected for background.

TABLE 1
Translation Reactions with Beta-Globin Templates

Reaction  Template             Mg²⁺ (mM)  ³⁵S Met (μl)   TCA CPM (2 μl)  dT₂₅ CPM (6 μl)
1         -                    1.0        2.0 (20 μCi)   3312            0
2         2.5 μg syn-β-globin  0.5        2.0 (20 μCi)   33860           36
3         2.5 μg syn-β-globin  1.0        2.0 (20 μCi)   22470           82
4         2.5 μg syn-β-globin  2.0        2.0 (20 μCi)   15696           86
5         2.5 μg LP-β-globin   0.5        2.0 (20 μCi)   32712           218
6         2.5 μg LP-β-globin   1.0        2.0 (20 μCi)   24226           402
7         2.5 μg LP-β-globin   2.0        2.0 (20 μCi)   15074           270
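The comparison that Table 1 supports can be made explicit by dividing the dT₂₅-precipitable counts for the puromycin-containing template by those for the unligated template at each Mg²⁺ concentration. The sketch below does only that, using the values in Table 1; the function names are illustrative, and no correction beyond the background subtraction already applied to the table is assumed.

```python
# Rows of Table 1:
# (reaction, template, Mg2+ in mM, TCA CPM per 2 ul, dT25 CPM per 6 ul)
TABLE_1 = [
    (1, None, 1.0, 3312, 0),
    (2, "syn-beta-globin", 0.5, 33860, 36),
    (3, "syn-beta-globin", 1.0, 22470, 82),
    (4, "syn-beta-globin", 2.0, 15696, 86),
    (5, "LP-beta-globin", 0.5, 32712, 218),
    (6, "LP-beta-globin", 1.0, 24226, 402),
    (7, "LP-beta-globin", 2.0, 15074, 270),
]


def dt25_cpm_by_mg(template):
    """Map Mg2+ concentration (mM) to dT25-precipitable CPM for one template."""
    return {mg: dt25 for _, name, mg, _, dt25 in TABLE_1 if name == template}


def fold_over_unligated():
    """Ratio of LP-beta-globin to syn-beta-globin dT25 CPM at each Mg2+ level."""
    syn = dt25_cpm_by_mg("syn-beta-globin")
    lp = dt25_cpm_by_mg("LP-beta-globin")
    return {mg: lp[mg] / syn[mg] for mg in sorted(lp)}


for mg, ratio in fold_over_unligated().items():
    print(f"{mg} mM Mg2+: {ratio:.1f}-fold more dT25-precipitable counts")
```

At every Mg²⁺ concentration tested, the template carrying the 3′ puromycin linker gives several-fold more dT₂₅-precipitable counts than the unligated control.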

To prepare the samples for gel analysis, 6 μl of each translation reaction was mixed with 1000 μl of Isolation Buffer (1 M NaCl, 100 mM Tris-Cl pH 8.2, 10 mM EDTA, 0.1 mM DTT), 1 μl RNase A (DNase Free, Boehringer Mannheim), and 20 μl of 20 μM dT₂₅ streptavidin agarose. Samples were incubated at 4° C. for one hour with rotation. Excess Isolation Buffer was removed, and the samples were added to a Millipore MC filter to remove any remaining Isolation Buffer. Samples were then washed four times with 50 μl of H₂O, and twice with 50 μl of 15 mM NaOH, 1 mM EDTA. The sample (300 μl) was neutralized with 100 μl TE pH 6.8 (10 mM Tris-Cl, 1 mM EDTA), 1 μl of 1 mg/ml RNase A (as above) was added, and the samples were incubated at 37° C. 10 μl of 2X SDS loading buffer (125 mM Tris-Cl pH 6.8, 2% SDS, 2% β-mercaptoethanol, 20% glycerol, 0.001% bromphenol blue) was then added, and the sample was lyophilized to dryness and resuspended in 20 μl H₂O and 1% β-mercaptoethanol. Samples were then loaded onto a peptide resolving gel as described by Schagger and von Jagow (Analytical Biochemistry 166:368 (1987)) and visualized by autoradiography.

The results of these experiments are shown in FIGS. 15A and 15B. As indicated in FIG. 15A, ³⁵S-methionine was incorporated into the protein portion of the syn-β-globin and LP-β-globin fusions. The protein was heterogeneous, but one strong band exhibited the mobility expected for β-globin mRNA. Also, as shown in FIG. 15B, after dT₂₅ isolation and RNase A digestion, no ³⁵S-labeled material remained in the syn-β-globin lanes (FIG. 15B, lanes 2-4). In contrast, in the LP-β-globin lanes, a homogeneously sized ³⁵S-labeled product was observed.

These results indicated that, as above, a fusion product was isolated by oligonucleotide affinity chromatography only when the template contained a 3′ puromycin. This was confirmed by scintillation counting (see Table 1). The material obtained is expected to contain the 30-P linker fused to some portion of β-globin. The fusion product appeared quite homogeneous in size as judged by gel analysis. However, since the product exhibited a mobility very similar to natural β-globin (FIGS. 15A and 15B, control lanes), it was difficult to determine the precise length of the protein portion of the fusion product.

FURTHER OPTIMIZATION OF RNA-PROTEIN FUSION FORMATION

Certain factors have been found to further increase the efficiency of formation of RNA-peptide fusions. Fusion formation, i.e., the transfer of the nascent peptide chain from its tRNA to the puromycin moiety at the 3′ end of the mRNA, is a slow reaction that follows the initial, relatively rapid translation of the open reading frame to generate the nascent peptide. The extent of fusion formation may be substantially enhanced by a post-translational incubation in elevated Mg²⁺ conditions (preferably, in a range of 50-100 mM) and/or by the use of a more flexible linker between the mRNA and the puromycin moiety. In addition, long incubations (12-48 hours) at low temperatures (preferably, −20° C.) also result in increased yields of fusions with less mRNA degradation than that which occurs during incubation at 30° C. By combining these factors, up to 40% of the input mRNA may be converted to mRNA-peptide fusion products, as shown below.

Synthesis of mRNA-Puromycin Conjugates. In these optimization experiments, puromycin-containing linker oligonucleotides were ligated to the 3′ ends of mRNAs using bacteriophage T4 DNA ligase in the presence of complementary DNA splints, generally as described above. Since T4 DNA ligase prefers precise base-pairing near the ligation junction and run-off transcription products with T7, T3, or SP6 RNA polymerase are often heterogeneous at their 3′ ends (Nucleic Acids Research 15:8783 (1987)), only those RNAs containing the correct 3′-terminal nucleotide were efficiently ligated. When a standard DNA splint was used, approximately 40% of runoff transcription products were ligated to the puromycin oligo. The amount of ligation product was increased by using excess RNA, but was not increased using excess puromycin oligonucleotide. Without being bound to a particular theory, it appeared that the limiting factor for ligation was the amount of RNA which was fully complementary to the corresponding region of the DNA splint.

To allow ligation of those transcripts ending with an extra non-templated nucleotide at the 3′ terminus (termed “N+1 products”), a mixture of the standard DNA splint with a new DNA splint containing an additional random base at the ligation junction was used. The ligation efficiency increased to more than 70% for an exemplary myc RNA template (that is, RNA124) in the presence of such a mixed DNA splint.

In addition to this modified DNA splint approach, the efficiency of mRNA-puromycin conjugate formation was also further optimized by taking into account the following three factors. First, mRNAs were preferably designed or utilized which lacked 3′-termini having any significant, stable secondary structure that would interfere with annealing to a splint oligonucleotide. In addition, because a high concentration of salt sometimes caused failure of the ligation reaction, thorough desalting of the oligonucleotides using NAP-25 columns was preferably included as a step in the procedure. Finally, because the ligation reaction was relatively rapid and was generally complete within 40 minutes at room temperature, significantly longer incubation periods were not generally utilized, as they often resulted in unnecessary degradation of the RNA.

Using the above conditions, mRNA-puromycin conjugates were synthesized as follows. Ligation of the myc RNA sequence (RNA124) to the puromycin-containing oligonucleotide was performed using either a standard DNA splint (e.g., 5′-TTTTTTTTTTAGCGCAAGA) (SEQ ID NO:28) or a splint containing a random base (N) at the ligation junction (e.g., 5′-TTTTTTTTTTNAGCGCAAGA) (SEQ ID NO:33). The reactions consisted of mRNA, the DNA splint, and the puromycin oligonucleotide in a molar ratio of 1.0:1.5-2.0:1.0. A mixture of these components was first heated at 94° C. for 1 minute and then cooled on ice for 15 minutes. Ligation reactions were performed for one hour at room temperature in 50 mM Tris-HCl (pH 7.5), 10 mM MgCl₂, 10 mM DTT, 1 mM ATP, 25 μg/ml BSA, 15 μM puromycin oligo, 15 μM mRNA, 22.5-30 μM DNA splint, RNasin inhibitor (Promega) at 1 U/μl, and 1.6 units of T4 DNA ligase per picomole of puromycin oligo. Following incubation, EDTA was added to a final concentration of 30 mM, and the reaction mixtures were extracted with phenol/chloroform. Full length conjugates were purified by denaturing PAGE and isolated by electroelution.
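The ligation conditions above fix the amount of each nucleic acid component and of ligase once a reaction volume is chosen. The following minimal sketch works through that arithmetic for the stated 15 μM puromycin oligo, 1.0:1.5-2.0:1.0 molar ratio, and 1.6 units of ligase per picomole of oligo. The function name, the parameter names, and the 100 μl example volume are assumptions made for illustration only.

```python
def ligation_setup(volume_ul, oligo_um=15.0, splint_ratio=1.5,
                   ligase_units_per_pmol=1.6):
    """Return picomoles of each component and ligase units for one reaction.

    A concentration in micromolar times a volume in microliters gives
    picomoles. The mRNA is used at 1.0 equivalent relative to the puromycin
    oligo, and the splint at splint_ratio equivalents (1.5 to 2.0 in the text).
    """
    if not 1.5 <= splint_ratio <= 2.0:
        raise ValueError("splint_ratio outside the 1.5-2.0 range in the text")
    oligo_pmol = oligo_um * volume_ul
    return {
        "puromycin_oligo_pmol": oligo_pmol,
        "mrna_pmol": oligo_pmol,
        "splint_pmol": oligo_pmol * splint_ratio,
        "ligase_units": oligo_pmol * ligase_units_per_pmol,
    }


# Hypothetical 100 ul reaction at the lower end of the splint range.
setup = ligation_setup(100.0)
for name, amount in setup.items():
    print(f"{name}: {amount:g}")
```

Scaling the volume scales every quantity linearly, so the same helper can be used for analytical and preparative reactions.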

General Reticulocyte Translation Conditions. In addition to improving the synthesis of the mRNA-puromycin conjugate, translation reactions were also further optimized as follows. Reactions were performed in rabbit reticulocyte lysates from different commercial sources (Novagen, Madison, Wis.; Amersham, Arlington Heights, Ill.; Boehringer Mannheim, Indianapolis, Ind.; Ambion, Austin, Tex.; and Promega, Madison, Wis.). A typical reaction mixture (25 μl final volume) consisted of 20 mM HEPES pH 7.6, 2 mM DTT, 8 mM creatine phosphate, 100 mM KCl, 0.75 mM Mg(OAc)₂, 1 mM ATP, 0.2 mM GTP, 25 μM of each amino acid (0.7 μM methionine if ³⁵S-Met was used), RNasin at 1 U/μl, and 60% (v/v) lysate. The final concentration of template was in the range of 50 nM to 800 nM. For each incubation, all components except lysate were mixed carefully on ice, and the frozen lysate was thawed immediately before use. After addition of lysate, the reaction mixture was mixed thoroughly by gentle pipetting and incubated at 30° C. to start translation. The optimal concentrations of Mg²⁺ and K⁺ varied within the ranges of 0.25 mM-2 mM and 75 mM-200 mM, respectively, for different mRNAs and were preferably determined in preliminary experiments. Particularly for poorly translated mRNAs, the concentrations of hemin, creatine phosphate, tRNA, and amino acids were also sometimes optimized. Potassium chloride was generally preferred over potassium acetate for fusion reactions, but a mixture of KCl and KOAc sometimes produced better results.

After translation at 30° C. for 30 to 90 minutes, the reaction was cooled on ice for 40 minutes, and Mg²⁺ was added. The final concentration of Mg²⁺ added at this step was also optimized for different mRNA templates, but was generally in the range of 50 mM to 100 mM (with 50 mM being preferably used for pools of mixed templates). The resulting mixture was incubated at −20° C. for 16 to 48 hours. To visualize the labeled fusion products, 2 μl of the reaction mixture was mixed with 4 μl loading buffer, and the mixture was heated at 75° C. for 3 minutes. The resulting mixture was then loaded onto a 6% glycine SDS-polyacrylamide gel (for ³²P-labeled templates) or an 8% tricine SDS-polyacrylamide gel (for ³⁵S-Met-labeled templates). As an alternative to this approach, the fusion products may also be isolated using dT₂₅ streptavidin agarose or thiopropyl sepharose (or both), generally as described herein.

To remove the RNA portion of the RNA-linker-puromycin-peptide conjugate for subsequent analysis by SDS-PAGE, an appropriate amount of EDTA was added after post-translational incubation, and the reaction mixture was desalted using a microcon-10 (or microcon-30) column. 2 μl of the resulting mixture (approximately 25 μl total) was mixed with 18 μl of RNase H buffer (30 mM Tris-HCl, pH 7.8, 30 mM (NH₄)₂SO₄, 8 mM MgCl₂, 1.5 mM β-mercaptoethanol, and an appropriate amount of complementary DNA splint), and the mixture was incubated at 4° C. for 45 minutes. RNase H was then added, and digestion was performed at 37° C. for 20 minutes.

Quality of Puromycin Oligo. The quality of the puromycin oligonucleotide was also important for the efficient generation of fusion products. The coupling of 5′-DMT, 2′-succinyl, N-trifluoroacetyl puromycin with CPG was not as efficient as the coupling of the standard nucleotides. As such, the coupling reaction was carefully monitored to avoid the formation of CPG with too low a concentration of coupled puromycin, and unreacted amino groups on the CPG were fully quenched to avoid subsequent synthesis of oligonucleotides lacking a 3′-terminal puromycin. It was also important to avoid the use of CPG containing very fine mesh particles, as these were capable of causing problems with valve clogging during subsequent automated oligonucleotide synthesis steps.

In addition, the synthesized puromycin oligo was preferably tested before large scale use to ensure the presence of puromycin at the 3′ end. In our experiments, no fusion was detected if puromycin was substituted with a deoxyadenosine containing a primary amino group at the 3′ end. To test for the presence of 3′ hydroxyl groups (i.e., the undesired synthesis of oligos lacking a 3′-terminal puromycin), the puromycin oligo may first be radiolabeled (e.g., by 5′-phosphorylation) and then used as a primer for extension with terminal deoxynucleotidyl transferase. In the presence of a 3′-terminal puromycin moiety, no extension product should be observed.

Time Course of Translation and Post-Translational Incubation. The translation reaction was relatively rapid and was generally completed within 25 minutes at 30° C. The fusion reaction, however, was slower. When a standard linker (dA₂₇dCdCP) was used at 30° C., fusion synthesis reached its maximum level in an additional 45 minutes. The post-translational incubation could be carried out at lower temperatures, for example, room temperature, 0° C., or −20° C. Less degradation of the mRNA template was observed at −20° C., and the best fusion results were obtained after incubation at −20° C. for 2 days.

The Effect of Mg²⁺ Concentration. A high concentration of Mg²⁺ in the post-translational incubation greatly stimulated fusion formation. For example, for the myc RNA template described above, a 3-4 fold stimulation of fusion formation was observed using a standard linker (dA₂₇dCdCP) in the presence of 50 mM Mg²⁺ during the 16 hour incubation at −20° C. (FIG. 17, compare lanes 3 and 4). Similarly, efficient fusion formation was also observed using a post-translational incubation in the presence of a 50-100 mM Mg²⁺ concentration when the reactions were carried out at room temperature for 30-45 minutes.

Linker Length and Sequence. The dependence of the fusion reaction on the length of the linker was also examined. In the range between 21 and 30 nucleotides (n=18-27), little change was seen in the efficiency of the fusion reaction (as described above). Shorter linkers (e.g., 13 nucleotides in length) resulted in lower fusion. In addition, although particular linkers of greater length (that is, of 45 nucleotides and 54 nucleotides) also resulted in somewhat lower fusion efficiencies, it remains likely that yet longer linkers may also be used to optimize the efficiency of the fusion reaction.

With respect to linker sequence, substitution of deoxyribonucleotide residues near the 3′ end with ribonucleotide residues did not significantly change the fusion efficiency. The dCdCP (or rCrCP) sequence at the 3′ end of the linker was, however, important to fusion formation. Substitution of dCdCP with dUdUP reduced the efficiency of fusion formation significantly.

Linker Flexibility. The dependence of the fusion reaction on the flexibility of the linker was also tested. In these experiments, it was determined that the fusion efficiency was low if the rigidity of the linker was increased by annealing with a complementary oligonucleotide near the 3′ end. Conversely, when a more flexible linker (for example, dA₂₁C₉C₉C₉dAdCdCP, where C₉ represents HO(CH₂CH₂O)₃PO₂) was used, the fusion efficiency was significantly improved. Compared to the standard linker (dA₂₇dCdCP), use of the more flexible linker (dA₂₁C₉C₉C₉dAdCdCP) improved the fusion efficiency for RNA124 more than 4-fold (FIG. 17, compare lanes 1 and 9). In addition, in contrast to the template with the standard linker, whose post-translation fusion proceeded poorly in the absence of a high concentration of Mg²⁺ (FIG. 17, lanes 3 and 4), the template with the flexible linker did not require elevated Mg²⁺ to produce a good yield of fusion product in an extended post-translational incubation at −20° C. (FIG. 17, compare lanes 11 and 12). This linker, therefore, was very useful if post-translational additions of high concentrations of Mg²⁺ were not desired. In addition, the flexible linker also produced optimal fusion yields in the presence of elevated Mg²⁺.

Quantitation of Fusion Efficiency. Fusion efficiency may be expressed as either the fraction of translated peptide converted to fusion product, or the fraction of input template converted to fusion product. To determine the fraction of translated peptide converted to fusion product, ³⁵S-Met labeling of the translated peptide was utilized. In these experiments, when a dA₂₇dCdCP or dA₂₇rCrCP linker was used, about 3.5% of the translated peptide was fused to its mRNA after a 1 hour translation incubation at 30° C. This value increased to 12% after overnight incubation at −20° C. When the post-translational incubation was carried out in the presence of a high concentration of Mg²⁺, more than 50% of the translated peptide was fused to the template.

For a template with a flexible linker, approximately 25% of the translated peptide was fused to the template after 1 hour of translation at 30° C. This value increased to over 50% after overnight incubation at −20° C. and to more than 75% if the post-translational incubation was performed in the presence of 50 mM Mg²⁺.

To determine the percentage of the input template converted to fusion product, the translations were performed using ³²P-labeled mRNA-linker template. When the flexible linker was used and post-translational incubation was performed at −20° C. without addition of Mg²⁺, about 20%, 40%, 40%, 35%, and 20% of the input template was converted to mRNA-peptide fusion when the concentration of the input RNA template was 800, 400, 200, 100, and 50 nM, respectively (FIG. 18). Similar results were obtained when the post-translational incubation was performed in the presence of 50 mM Mg²⁺. The best results were achieved using lysates obtained from Novagen, Amersham, or Ambion (FIG. 19).
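Because the fraction converted and the input concentration both vary, it can help to express the same figures as an absolute concentration of fusion product. The sketch below multiplies each reported fraction by its input template concentration; the percentages are the approximate values given above for the flexible linker without added Mg²⁺, and the variable and function names are illustrative.

```python
# (input template concentration in nM, approximate fraction converted to fusion)
CONVERSION_BY_INPUT = [
    (800, 0.20),
    (400, 0.40),
    (200, 0.40),
    (100, 0.35),
    (50, 0.20),
]


def fusion_concentrations_nm(data=CONVERSION_BY_INPUT):
    """Return {input nM: fusion product nM} for each titration point."""
    return {template_nm: template_nm * fraction
            for template_nm, fraction in data}


fusion_nm = fusion_concentrations_nm()
for template_nm, product_nm in fusion_nm.items():
    print(f"{template_nm} nM template -> about {product_nm:.0f} nM fusion")
```

On these figures, the absolute yield of fusion plateaus at about 160 nM for 400 nM and 800 nM template, even though the fraction converted falls at the higher input.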

The mobility differences between mRNAs and mRNA-peptide fusions as measured by SDS-PAGE may be very small if the mRNA template is long. In such cases, the template may be labeled at the 5′ end of the linker with ³²P. The long RNA portion may then be digested with RNase H in the presence of a complementary DNA splint after translation/incubation, and the fusion efficiency determined by quantitation of the ratio of unmodified linker to linker-peptide fusion. Compared to RNase A digestion, which produces 3′-P and 5′-OH, this approach has the advantage that the ³²P at the 5′ end of the linker is not removed.

Intramolecular vs. Intermolecular Fusion During Post-Translational Incubation. In addition to the above experiments, we tested whether the fusion reaction that occurred at −20° C. in the presence of Mg²⁺ was intra- or intermolecular in nature. Free linker (dA₂₇dCdCP or dA₂₁C₉C₉C₉dAdCdCP, where C₉ is -O(CH₂CH₂O)₃PO₂-) was coincubated with a template containing a DNA linker, but without puromycin at the 3′ end, under the translation and post-translational incubation conditions described above. In these experiments, no detectable amount (that is, less than 2% of the normal level) of ³⁵S-Met was incorporated into linker-peptide product, suggesting that post-translational fusion occurred primarily between the nascent peptide and the mRNA bound to the same ribosome.

Optimization Results. As illustrated above, by using the flexible linker and/or performing the post-translational incubation in the presence of a high concentration of Mg²⁺, fusion efficiencies were increased to approximately 40% of input mRNA. These results indicated that as many as 10¹⁴ molecules of mRNA-peptide fusion could be generated per ml of in vitro translation reaction mix, producing pools of mRNA-peptide fusions of very high complexity for use in in vitro selection experiments.
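The figure of roughly 10¹⁴ molecules per ml follows from the template concentration, the fraction converted, and Avogadro's number. The following short sketch reproduces the estimate, assuming 400 nM input template and 40% conversion as the illustrative case; the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def fusions_per_ml(template_nm: float, fraction_converted: float) -> float:
    """Estimate mRNA-peptide fusion molecules formed per ml of reaction."""
    moles_per_liter = template_nm * 1e-9
    moles_per_ml = moles_per_liter * 1e-3
    return moles_per_ml * fraction_converted * AVOGADRO


estimate = fusions_per_ml(400.0, 0.40)
print(f"about {estimate:.1e} fusion molecules per ml")
```

Under these assumptions the estimate falls just below 10¹⁴ molecules per ml, consistent with the order of magnitude stated above.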

SELECTIVE ENRICHMENT OF RNA-PROTEIN FUSIONS

We have demonstrated the feasibility of using RNA-peptide fusions in selection and evolution experiments by enriching a particular RNA-peptide fusion from a complex pool of random sequence fusions on the basis of the encoded peptide. In particular, we prepared a series of mixtures in which a small quantity of known sequence (in this case, the long myc template, LP154) was combined with some amount of random sequence pool (that is, LP160). These mixtures were translated, and the RNA-peptide fusion products selected by oligonucleotide and disulfide affinity chromatography as described herein. The myc-template fusions were selectively immunoprecipitated with anti-myc monoclonal antibody (FIG. 16A). To measure the enrichment obtained in this selective step, aliquots of the mixture of cDNA/mRNA-peptide fusions from before and after the immunoprecipitation were amplified by PCR in the presence of a radiolabeled primer. The amplified DNA was digested with a restriction endonuclease that cut the myc template sequence but not the pool (FIGS. 16B and 16C). Quantitation of the ratio of cut and uncut DNA indicated that the myc sequence was enriched by 20-40 fold relative to the random library by immunoprecipitation.
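The enrichment measurement described above reduces to comparing the ratio of cut (myc) to uncut (pool) DNA before and after immunoprecipitation. A minimal sketch of that calculation follows. The band intensities are hypothetical placeholders rather than data from FIGS. 16B and 16C, and the sketch assumes that band intensity is proportional to the amount of each template and that digestion of the myc sequence is complete.

```python
def enrichment_factor(cut_before: float, uncut_before: float,
                      cut_after: float, uncut_after: float) -> float:
    """Fold enrichment of the cut (myc) sequence relative to the uncut pool."""
    for value in (cut_before, uncut_before, cut_after, uncut_after):
        if value <= 0:
            raise ValueError("band intensities must be positive")
    ratio_before = cut_before / uncut_before
    ratio_after = cut_after / uncut_after
    return ratio_after / ratio_before


# Hypothetical intensities: myc at 1:200 before selection, 3:20 afterwards.
fold = enrichment_factor(cut_before=5.0, uncut_before=1000.0,
                         cut_after=150.0, uncut_after=1000.0)
print(f"enrichment: about {fold:.0f}-fold")
```

With these placeholder values the function returns a 30-fold enrichment, which lies within the 20-40 fold range reported for the myc sequence.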

These experiments were carried out as follows.

Translation Reactions. Translation reactions were performed generally as described above. Specifically, reactions were performed at 30° C. for one hour according to the manufacturer's specifications (Novagen) and frozen overnight at −20° C. Two versions of six samples were made, one containing ³⁵S methionine and one containing cold methionine added to a final concentration of 52 μM. Reactions 1-6 contained the amounts of templates described in Table 2. All numbers in Table 2 represent picomoles of template per 25 μl reaction mixture.

TABLE 2
Template Ratios Used in Doped Selection

Reaction  LP154  LP160
1         -      -
2         5      -
3         1      20
4         0.1    20
5         0.01   20
6         -      20

Preparation of dT₂₅ Streptavidin Agarose. Streptavidin agarose (Pierce) was washed three times with TE 8.2 (10 mM Tris-Cl pH 8.2, 1 mM EDTA) and resuspended as a 1:1 (v/v) slurry in TE 8.2. 3′ biotinyl T₂₅ synthesized using Bioteg CPG (Glen Research) was then added to the desired final concentration (generally 10 or 20 μM), and incubation was carried out with agitation for 1 hour. The dT₂₅ streptavidin agarose was then washed three times with TE 8.2 and stored at 4° C. until use.

Purification of Templates from Translation Reactions. To purify templates from translation reactions, 25 μl of each reaction was removed and added to 7.5 ml of Isolation Buffer (1 M NaCl, 100 mM Tris-Cl pH 8.2, 10 mM EDTA, 0.1 mM DTT) and 125 μl of 20 μM dT₂₅ streptavidin agarose. This solution was incubated at 4° C. for one hour with rotation. The tubes were centrifuged and the supernatant removed. One ml of Isolation Buffer was added, the slurry was resuspended, and the mixtures were transferred to 1.5 ml microcentrifuge tubes. The samples were then washed four times with 1 ml aliquots of ice cold Isolation Buffer. Hot and cold samples from identical reactions were then combined in a Millipore MC filter unit and were eluted from the dT₂₅ agarose by washing with 2 volumes of 100 μl H₂O, 0.1 mM DTT, and 2 volumes of 15 mM NaOH, 1 mM EDTA.

To this eluent was added 40 μl of a 50% slurry of washed thiopropyl sepharose (Pharmacia), and incubation was carried out at 4° C. with rotation for 1 hour. The samples were then washed with three 1 ml volumes of TE 8.2 and the eluent removed. One μl of 1M DTT was added to the solid (total volume approximately 20-30 μl), and the sample was incubated for several hours, removed, and washed four times with 20 μl H₂O (total volume 90 μl). The eluent contained 2.5 mM thiopyridone as judged by UV absorbance. 50 μl of this sample was ethanol precipitated by adding 6 μl 3M NaOAc pH 5.2, 10 mM spermine, 1 μl glycogen (10 mg/ml, Boehringer Mannheim), and 170 μl 100% EtOH, incubating for 30 minutes at −70° C., and centrifuging for 30 minutes at 13,000 rpm in a microcentrifuge.

Reverse Transcriptase Reactions. Reverse transcription reactions were performed on both the ethanol precipitated and the thiopyridone eluent samples as follows. For the ethanol precipitated samples, 30 μl of resuspended template, H₂O to 48 μl, and 200 picomoles of primer 21.103 (SEQ ID NO:22) were annealed at 70° C. for 5 minutes and cooled on ice. To this sample, 16 μl of first strand buffer (250 mM Tris-Cl pH 8.3, 375 mM KCl, and 15 mM MgCl₂; available from Gibco BRL, Grand Island, N.Y.), 8 μl 100 mM DTT, and 4 μl 10 mM NTP were added and equilibrated at 42° C., and 4 μl Superscript II reverse transcriptase (Gibco BRL, Grand Island, N.Y.) was added. H₂O (13 μl) was added to the TP sepharose eluent (35 μl), and reactions were performed as above. After incubation for one hour, like-numbered samples were combined (total volume 160 μl). 10 μl of sample was reserved for the PCR of each unselected sample, and 150 μl of sample was reserved for immunoprecipitation.

Immunoprecipitation. To carry out immunoprecipitations, 170 μl of reverse transcription reaction was added to 1 ml of Dilution Buffer (10 mM Tris-Cl, pH 8.2, 140 mM NaCl, 1% v/v Triton X-100) and 20 μl of Protein G/A conjugate (Calbiochem, La Jolla, Calif.), and precleared by incubation at 4° C. with rotation for 1 hour. The eluent was removed, and 20 μl G/A conjugate and 20 μl of monoclonal antibody (2 μg, 12 picomoles) were added, and the sample incubated with rotation for two hours at 4° C. The conjugate was precipitated by microcentrifugation at 2500 rpm for 5 minutes, the eluent removed, and the conjugate washed three times with 1 ml aliquots of ice cold Dilution Buffer. The sample was then washed with 1 ml ice cold 10 mM Tris-Cl, pH 8.2, 100 mM NaCl. The bound fragments were removed using 3 volumes of frozen 4% HOAc, and the samples were lyophilized to dryness.

PCR of Selected and Unselected Samples. PCR reactions were carried out by adding 20 μl of concentrated NH₄OH to 10 μl of the unselected material and the entirety of the selected material and incubating for 5 minutes each at 55° C., 70° C., and 90° C. to destroy any RNA present in the sample. The samples were then evaporated to dryness using a speedvac. 200 μl of PCR mixture (1 μM primers 21.103 and 42.108, 200 μM dNTP in PCR buffer plus Mg²⁺ (Boehringer Mannheim), and 2 μl of Taq polymerase (Boehringer Mannheim)) were added to each sample. 16 cycles of PCR were performed on unselected sample number 2, and 19 cycles were performed on all other samples.

Samples were then amplified in the presence of 5′ ³²P-labeled primer 21.103 according to Table 3, and purified twice individually using Wizard direct PCR purification kits (Promega) to remove all primer and shorter fragments.

TABLE 3
Amplification of Selected and Unselected PCR Samples

Sample   Type         Volume   Cycles
1        unselected   20 μl    5
2        unselected    5 μl    4
3        unselected   20 μl    5
4        unselected   20 μl    5
5        unselected   20 μl    5
6        unselected   20 μl    5
1        selected     20 μl    5
2        selected      5 μl    4
3        selected     20 μl    5
4        selected     20 μl    7
5        selected     20 μl    7
6        selected     20 μl    7

Restriction Digests. ³²P labeled DNA prepared from each of the above PCR reactions was added in equal amounts (by cpm of sample) to restriction digest reactions according to Table 4. The total volume of each reaction was 25 μl. 0.5 μl of AlwNI (5 units, New England Biolabs) was added to each reaction. Samples were incubated at 37° C. for 1 hour, and the enzyme was heat inactivated by a 20 minute incubation at 65° C. The samples were then mixed with 10 μl denaturing loading buffer (1 ml ultrapure formamide (USB), 20 μl 0.5M EDTA, and 20 μl 1M NaOH), heated to 90° C. for 1 minute, cooled, and loaded onto a 12% denaturing polyacrylamide gel containing 8M urea. Following electrophoresis, the gel was fixed with 10% (v/v) HOAc, 10% (v/v)

TABLE 4
Restriction Digest Conditions w/AlwNI

Sample   Type         Volume DNA added to reaction   Total volume
1        unselected   20 μl                          25 μl
2        unselected    4 μl                          25 μl
3        unselected   20 μl                          25 μl
4        unselected   20 μl                          25 μl
5        unselected    4 μl                          25 μl
6        unselected   20 μl                          25 μl
1        selected     20 μl                          25 μl
2        selected      8 μl                          25 μl
3        selected     12 μl                          25 μl
4        selected     12 μl                          25 μl
5        selected     20 μl                          25 μl
6        selected     20 μl                          25 μl

Quantitation of Digest. The amount of myc versus pool DNA present in a sample was quantitated using a phosphorimager (Molecular Dynamics). The amount of material present in each band was determined as the integrated volume of identical rectangles drawn around the gel bands. The total cpm present in each band was calculated as the volume minus the background. Three values of background were used: (1) an average of identical squares outside the area where counts occurred on the gel; (2) the cpm present in the unselected pool lane where the myc band should appear (no band appears at this position on the gel); and (3) a normalized value that reproduced the closest value to the 10-fold template increments between unselected lanes. Lanes 2, 3, and 4 of FIGS. 16B and 16C demonstrate enrichment of the target versus the pool sequence. The demonstrable enrichment in lane 3 (unselected/selected) yielded the largest values (17, 43, and 27 fold using methods 1-3, respectively) due to the optimization of the signal to noise ratio for this sample. These results are summarized in Table 5.

TABLE 5
Enrichment of Myc Template vs. Pool

Method   Lane 2 (20)   Lane 3 (200)   Lane 4 (2000)
1        7.0           16.6           5.7
2        10.4          43             39
3        8.7           27             10.2
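The quantitation described above (band volume minus background, compared between unselected and selected lanes) can be sketched as follows. This is a minimal illustration that assumes fold enrichment is the ratio of the target-to-pool signal in the selected lane to the same ratio in the unselected lane; the text does not spell out the formula, so that definition, the function names, and the numeric inputs are all hypothetical.

```python
def net_counts(band_volume, background):
    """Background-subtracted signal for one band, clamped at zero."""
    return max(band_volume - background, 0.0)


def enrichment(unselected, selected, background):
    """Fold enrichment of target over pool between two lanes.

    Each lane is a (target_band_volume, pool_band_volume) pair measured
    with identical rectangles. Assumed definition (not stated in the text):
    (target/pool in selected lane) / (target/pool in unselected lane).
    """
    u_target, u_pool = (net_counts(v, background) for v in unselected)
    s_target, s_pool = (net_counts(v, background) for v in selected)
    if u_target == 0 or u_pool == 0 or s_pool == 0:
        raise ValueError("band at or below background; ratio undefined")
    return (s_target / s_pool) / (u_target / u_pool)


# Hypothetical integrated volumes in arbitrary units, not experimental data.
unselected_lane = (200.0, 6500.0)
selected_lane = (1700.0, 3300.0)
fold = enrichment(unselected_lane, selected_lane, background=100.0)
print(f"{fold:.1f}-fold enrichment")
```

Because the background value is subtracted from every band, the choice among the three background methods changes the result most strongly when the target band in the unselected lane is faint, which is consistent with the spread between methods seen in Table 5.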

In a second set of experiments, these same PCR products were purified once using Wizard direct PCR purification kits, and digests were quantitated by method (2) above. In these experiments, similar results were obtained; enrichments of 10.7, 38, and 12 fold, respectively, were measured for samples equivalent to those in lanes 2, 3, and 4 above.

USE OF PROTEIN SELECTION SYSTEMS

The selection systems of the present invention have commercial applications in any area where protein technology is used to solve therapeutic, diagnostic, or industrial problems. This selection technology is useful for improving or altering existing proteins as well as for isolating new proteins with desired functions. These proteins may be naturally-occurring sequences, may be altered forms of naturally-occurring sequences, or may be partly or fully synthetic sequences.

Isolation of Novel Binding Reagents. In one particular application, the RNA-protein fusion technology described herein is useful for the isolation of proteins with specific binding (for example, ligand binding) properties. Proteins exhibiting highly specific binding interactions may be used as non-antibody recognition reagents, allowing RNA-protein fusion technology to circumvent traditional monoclonal antibody technology. Antibody-type reagents isolated by this method may be used in any area where traditional antibodies are utilized, including diagnostic and therapeutic applications.

Improvement of Human Antibodies. The present invention may also be used to improve human or humanized antibodies for the treatment of any of a number of diseases. In this application, antibody libraries are developed and are screened in vitro, eliminating the need for techniques such as cell-fusion or phage display. In one important application, the invention is useful for improving single chain antibody libraries (Ward et al., Nature 341:544 (1989); and Goulot et al., J. Mol. Biol. 213:617 (1990)). For this application, the variable region may be constructed either from a human source (to minimize possible adverse immune reactions of the recipient) or may contain a totally randomized cassette (to maximize the complexity of the library). To screen for improved antibody molecules, a pool of candidate molecules is tested for binding to a target molecule (for example, an antigen immobilized as shown in FIG. 2). Higher levels of stringency are then applied to the binding step as the selection progresses from one round to the next. To increase stringency, conditions such as number of wash steps, concentration of excess competitor, buffer conditions, length of binding reaction time, and choice of immobilization matrix are altered.
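The effect of repeating the binding step over successive rounds can be pictured with a deliberately simple model. The sketch below assumes that each round enriches binders over non-binders by a constant factor; in practice the factor would change as stringency is raised, and both the function name and the numbers used are hypothetical rather than taken from the experiments described herein.

```python
def rounds_to_dominate(initial_fraction, fold_per_round, threshold=0.5):
    """Count selection rounds until binders reach `threshold` of the pool.

    Toy model: every round multiplies the binder:non-binder odds by a
    constant `fold_per_round` (an assumption made for illustration).
    Returns (number_of_rounds, final_binder_fraction).
    """
    if not 0.0 < initial_fraction < 1.0:
        raise ValueError("initial_fraction must lie strictly between 0 and 1")
    if fold_per_round <= 1.0:
        raise ValueError("fold_per_round must exceed 1 for enrichment to occur")
    fraction = initial_fraction
    rounds = 0
    while fraction < threshold:
        enriched = fraction * fold_per_round
        # Renormalize so binder and non-binder fractions sum to one.
        fraction = enriched / (enriched + (1.0 - fraction))
        rounds += 1
    return rounds, fraction


# Hypothetical inputs: one binder per 10^9 molecules, 30-fold gain per round.
rounds, final_fraction = rounds_to_dominate(1e-9, 30.0)
print(rounds, round(final_fraction, 3))
```

Under these assumed inputs, a binder present at one part in 10⁹ would take seven rounds to become the majority of the pool, which illustrates why multiple rounds of selection and amplification are used for rare sequences.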

Single chain antibodies may be used either directly for therapy or indirectly for the design of standard antibodies. Such antibodies have a number of potential applications, including the isolation of anti-autoimmune antibodies, immune suppression, and the development of vaccines for viral diseases such as AIDS.

Isolation of New Catalysts. The present invention may also be used to select new catalytic proteins. In vitro selection and evolution has been used previously for the isolation of novel catalytic RNAs and DNAs, and, in the present invention, is used for the isolation of novel protein enzymes. In one particular example of this approach, a catalyst may be isolated indirectly by selecting for binding to a chemical analog of the catalyst's transition state. In another particular example, direct isolation may be carried out by selecting for covalent bond formation with a substrate (for example, using a substrate linked to an affinity tag) or by cleavage (for example, by selecting for the ability to break a specific bond and thereby liberate catalytic members of a library from a solid support).

This approach to the isolation of new catalysts has at least two important advantages over catalytic antibody technology (reviewed in Schultz et al., J. Chem. Engng. News 68:26 (1990)). First, in catalytic antibody technology, the initial pool is generally limited to the immunoglobulin fold; in contrast, the starting library of RNA-protein fusions may be either completely random or may consist, without limitation, of variants of known enzymatic structures or protein scaffolds. In addition, the isolation of catalytic antibodies generally relies on an initial selection for binding to transition state reaction analogs followed by laborious screening for active antibodies; again, in contrast, direct selection for catalysis is possible using an RNA-protein fusion library approach, as previously demonstrated using RNA libraries. In an alternative approach to isolating protein enzymes, the transition-state-analog and direct selection approaches may be combined.

Enzymes obtained by this method are highly valuable. For example, there currently exists a pressing need for novel and effective industrial catalysts that allow improved chemical processes to be developed. A major advantage of the invention is that selections may be carried out in arbitrary conditions and are not limited, for example, to in vivo conditions. The invention therefore facilitates the isolation of novel enzymes or improved variants of existing enzymes that can carry out highly specific transformations (and thereby minimize the formation of undesired byproducts) while functioning in predetermined environments, for example, environments of elevated temperature, pressure, or solvent concentration.

An In Vitro Interaction Trap. The RNA-protein fusion technology is also useful for screening cDNA libraries and cloning new genes on the basis of protein-protein interactions. By this method, a cDNA library is generated from a desired source (for example, by the method of Ausubel et al., supra, chapter 5). To each of the candidate cDNAs, a peptide acceptor (for example, as a puromycin tail) is ligated (for example, using the techniques described above for the generation of LP77, LP154, and LP160). RNA-protein fusions are then generated as described herein, and the ability of these fusions (or improved versions of the fusions) to interact with particular molecules is then tested as described above. If desired, stop codons and 3′ UTR regions may be avoided in this process by either (i) adding suppressor tRNA to allow readthrough of the stop regions, (ii) removing the release factor from the translation reaction by immunoprecipitation, (iii) a combination of (i) and (ii), or (iv) removal of the stop codons and 3′ UTR from the DNA sequences.
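Option (iv) above, removal of the stop codon and 3′ UTR at the DNA level, can be illustrated computationally. The sketch below is an assumption-laden example rather than a described protocol: it takes the first ATG in the sequence as the start codon, reads in that frame, and truncates at the first in-frame stop. The function name and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_stop_and_utr(cdna, start_codon="ATG"):
    """Remove the first in-frame stop codon and everything 3' of it.

    Assumes the first occurrence of `start_codon` sets the reading frame.
    Sequence 5' of the start codon is retained unchanged.
    """
    seq = cdna.upper()
    start = seq.find(start_codon)
    if start == -1:
        raise ValueError("no start codon found")
    # Walk the open reading frame one codon at a time.
    for pos in range(start, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return seq[:pos]
    return seq  # no in-frame stop codon: nothing to trim


# Hypothetical cDNA: 5' leader, short ORF, TAA stop, then 3' UTR.
example = "GGGACAATGGCTGAAGAACAGTAAGCTTTT"
trimmed = strip_stop_and_utr(example)
print(trimmed)
```

A peptide acceptor would then be joined to the 3′ end of the trimmed sequence so that translation runs directly from the last sense codon into the acceptor.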

The fact that the interaction step takes place in vitro allows careful control of the reaction stringency, using nonspecific competitor, temperature, and ionic conditions. Replacement of normal small molecules with non-hydrolyzable analogs (e.g., ATP vs. ATPγS) provides for selections that discriminate between different conformers of the same molecule. This approach is useful for both the cloning and functional identification of many proteins since the RNA sequence of the selected binding partner is covalently attached and may therefore be readily isolated. In addition, the technique is useful for identifying functions and interactions of the approximately 50,000-100,000 human genes, whose sequences are currently being determined by the Human Genome project.

USE OF RNA-PROTEIN FUSIONS IN A MICROCHIP FORMAT

“DNA chips” consist of spatially defined arrays of immobilized oligonucleotides or cloned fragments of cDNA or genomic DNA, and have applications such as rapid sequencing and transcript profiling. By annealing a mixture of RNA-protein fusions (for example, generated from a cellular DNA or RNA pool), to such a DNA chip, it is possible to generate a “protein display chip,” in which each spot corresponding to one immobilized sequence is capable of annealing to its corresponding RNA sequence in the pool of RNA-protein fusions. By this approach, the corresponding protein is immobilized in a spatially defined manner because of its linkage to its own mRNA, and chips containing sets of DNA sequences display the corresponding set of proteins. Alternatively, peptide fragments of these proteins may be displayed if the fusion library is generated from smaller fragments of cDNAs or genomic DNAs.

Such ordered displays of proteins and peptides have many uses. For example, they represent powerful tools for the identification of previously unknown protein-protein interactions. In one specific format, a probe protein is detectably labeled (for example, with a fluorescent dye), and the labeled protein is incubated with a protein display chip. By this approach, the identity of proteins that are able to bind the probe protein is determined from the location of the spots on the chip that become labeled due to binding of the probe. Another application is the rapid determination of proteins that are chemically modified through the action of modifying enzymes (for example, protein kinases, acyl transferases, and methyl transferases). By incubating the protein display chip with the enzyme of interest and a radioactively labeled substrate, followed by washing and autoradiography, the location and hence the identity of those proteins that are substrates for the modifying enzyme may be readily determined. In addition, the use of this approach with ordered displays of small peptides allows the further localization of such modification sites.
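The addressing logic of a protein display chip, in which the position of an immobilized DNA sequence identifies the protein whose mRNA anneals there, can be sketched as a lookup. The example below assumes perfect Watson-Crick complementarity and uses short hypothetical capture oligonucleotides and fusion names; real hybridization depends on length, temperature, and salt, none of which is modeled here.

```python
# Translation table giving the RNA complement of each RNA base.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def rna_target_of(dna_oligo):
    """RNA sequence (5'->3') perfectly complementary to a DNA capture oligo."""
    as_rna = dna_oligo.upper().replace("T", "U")
    return as_rna.translate(COMPLEMENT)[::-1]


def assign_spots(chip, fusions):
    """Map each chip position to the fusions whose RNA can anneal there.

    `chip` maps (row, column) to an immobilized DNA oligo (5'->3');
    `fusions` maps a fusion name to the RNA portion of that fusion (5'->3').
    """
    layout = {}
    for position, oligo in chip.items():
        target = rna_target_of(oligo)
        layout[position] = [
            name for name, rna in fusions.items() if target in rna.upper()
        ]
    return layout


# Hypothetical chip layout and fusion pool, for illustration only.
chip = {(0, 0): "TTCAGC", (0, 1): "TTTGGA"}
fusions = {"fusion_A": "AUGGCUGAAGAA", "fusion_B": "AUGGGAUCCAAA"}
layout = assign_spots(chip, fusions)
print(layout)
```

Once such a layout is known, a labeled spot observed at a given position after probe binding or enzymatic modification can be traced back to the protein displayed there.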

Protein display technology may be carried out using arrays of nucleic acids (including RNA, but preferably DNA) immobilized on any appropriate solid support. Exemplary solid supports may be made of materials such as glass (e.g., glass plates), silicon or silicon-glass (e.g., microchips), or gold (e.g., gold plates). Methods for attaching nucleic acids to precise regions on such solid surfaces, e.g., photolithographic methods, are well known in the art, and may be used to generate solid supports (such as DNA chips) for use in the invention. Exemplary methods for this purpose include, without limitation, Schena et al., Science 270:467-470 (1995); Kozal et al., Nature Medicine 2:753-759 (1996); Cheng et al., Nucleic Acids Research 24:380-385 (1996); Lipshutz et al., BioTechniques 19:442-447 (1995); Pease et al., Proc. Natl. Acad. Sci. USA 91:5022-5026 (1994); Fodor et al., Nature 364:555-556 (1993); Pirrung et al., U.S. Pat. No. 5,143,854; and Fodor et al., WO 92/10092.

SEQUENCE LISTING

Number of sequences: 33

SEQ ID NO: 1; length: 76; type: RNA; organism: Artificial Sequence; description: Translation template
gggaggacga aauggaacag aaacugaucu cugaagaaga ccugaacaaa aaaaaaaaaa 60
aaaaaaaaaa aaaacc 76

SEQ ID NO: 2; length: 10; type: PRT; organism: Homo sapiens
Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu
1 5 10

SEQ ID NO: 3; length: 153; type: RNA; organism: Artificial Sequence; description: Translation template
gggacaauua cuauuuacaa uuacaauggc ugaagaacag aaacugaucu cugaagaaga 60
ccugcugcgu aaacgucgug aacagcugaa acacaaacug gaacagcugc guaacucuug 120
cgcuaaaaaa aaaaaaaaaa aaaaaaaaaa acc 153

SEQ ID NO: 4; length: 34; type: PRT; organism: Artificial Sequence; description: Random peptide
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Gln Leu Arg Asn Ser
20 25 30
Cys Ala

SEQ ID NO: 5; length: 25; type: RNA; organism: Tobacco Mosaic Virus
gggacaauua cuauuuacaa uuaca 25

SEQ ID NO: 6; length: 10; type: RNA; organism: Escherichia coli
ggaggacgaa 10

SEQ ID NO: 7; length: 34; type: PRT; organism: Homo sapiens
Met Ala Glu Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Leu Arg Lys
1 5 10 15
Arg Arg Glu Gln Lys Leu Lys His Lys Leu Glu Gln Leu Arg Asn Ser
20 25 30
Cys Ala

SEQ ID NO: 8; length: 29; type: DNA; organism: Artificial Sequence; description: Translation template
aaaaaaaaaa aaaaaaaaaa aaaaaaacc 29

SEQ ID NO: 9; length: 12; type: DNA; organism: Artificial Sequence; description: Translation template
aaaaaaaaaa cc 12

SEQ ID NO: 10; length: 24; type: DNA; organism: Artificial Sequence; description: Translation template
cgcggttttt attttttttt ttcc 24

SEQ ID NO: 11; length: 42; type: RNA; organism: Artificial Sequence; description: Translation template
ggaggacgaa augaaaaaaa aaaaaaaaaa aaaaaaaaaa cc 42

SEQ ID NO: 12; length: 42; type: RNA; organism: Artificial Sequence; description: Translation template
ggaggacgaa cugaaaaaaa aaaaaaaaaa aaaaaaaaaa cc 42

SEQ ID NO: 13; length: 42; type: RNA; organism: Artificial Sequence; description: Translation template
ggaggacgaa augaaaaaaa aaaaaaaaaa aaaaaaaaaa cc 42

SEQ ID NO: 14; length: 36; type: RNA; organism: Artificial Sequence; description: Translation template
ggaggacgaa cugaaaaaaa aaaaaaaaaa aaaacc 36

SEQ ID NO: 15; length: 33; type: RNA; organism: Artificial Sequence; description: Translation template
ggaggacgaa cugaaaaaaa aaaaaaaaaa acc 33

SEQ ID NO: 16; length: 30; type: RNA; organism: Artificial Sequence; description: Translation template
ggaggacgaa cugaaaaaaa aaaaaaaacc 30

SEQ ID NO: 17; length: 159; type: RNA; organism: Artificial Sequence; description: Translation template
gggacaauua cuauuuacaa uuacaaugnn snnsnnsnns nnsnnsnnsn nsnnsnnsnn 60
snnsnnsnns nnsnnsnnsn nsnnsnnsnn snnsnnsnns nnsnnsnnsc agcugcguaa 120
cucuugcgcu aaaaaaaaaa aaaaaaaaaa aaaaaaacc 159

SEQ ID NO: 18; length: 64; type: DNA; organism: Homo sapiens
gttcaggtct tcttgagaga tcagtttctg ttccatttcg tcctccctat agtgagtcgt 60
atta 64

SEQ ID NO: 19; length: 18; type: DNA; organism: Homo sapiens
taatacgact cactatag 18

SEQ ID NO: 20; length: 12; type: PRT; organism: Homo sapiens
Met Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn
1 5 10

SEQ ID NO: 21; length: 99; type: DNA; organism: Homo sapiens
agcgcaagag ttacgcagct gttccagttt gtgtttcagc tgttcacgac gtttacgcag 60
caggtcttct tcagagatca gtttctgttc ttcagccat 99

SEQ ID NO: 22; length: 21; type: DNA; organism: Homo sapiens
agcgcaagag ttacgcagct g 21

SEQ ID NO: 23; length: 63; type: DNA; organism: Homo sapiens
taatacgact cactataggg acaattacta tttacaatta caatggctga agaacagaaa 60
ctg 63

SEQ ID NO: 24; length: 33; type: PRT; organism: Homo sapiens
Met Ala Glu Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Leu Arg Lys
1 5 10 15
Arg Arg Glu Gln Leu Lys His Lys Leu Glu Gln Leu Arg Asn Ser Cys
20 25 30
Ala

SEQ ID NO: 25; length: 49; type: DNA; organism: Artificial Sequence; description: Primers for RNA pool
ccctgttaat gataaatgtt aatgttacgt cgacgcattg agataccga 49

SEQ ID NO: 26; length: 42; type: DNA; organism: Artificial Sequence; description: Primers for RNA pool
taatacgact cactataggg acaattacta tttacaatta ca 42

SEQ ID NO: 27; length: 21; type: DNA; organism: Artificial Sequence; description: Primers for RNA pool
agcgcaagag ttacgcagct g 21

SEQ ID NO: 28; length: 19; type: DNA; organism: Artificial Sequence; description: DNA splint
tttttttttt agcgcaaga 19

SEQ ID NO: 29; length: 18; type: DNA; organism: Homo sapiens
gtggtatttg tgagccag 18

SEQ ID NO: 30; length: 40; type: DNA; organism: Phage T7
taatacgact cactataggg acacttgctt ttgacacaac 40

SEQ ID NO: 31; length: 20; type: DNA; organism: Artificial Sequence; description: DNA splint
tttttttttt gtggtatttg 20

SEQ ID NO: 32; length: 124; type: RNA; organism: Homo sapiens
gggacaauua cuauuuacaa uuacaauggc ugaagaacag aaacugaucu cugaagaaga 60
ccugcugcgu aaacgucgug aacagcugaa acacaaacug gaacagcugc guaacucuug 120
cgcu 124

SEQ ID NO: 33; length: 19; type: DNA; organism: Artificial Sequence; description: DNA splint
tttttttttt agcgcaaga 19

What is claimed is:
 1. A molecule comprising a ribonucleic acid covalently bonded through an amide bond to an antibody, wherein said antibody is encoded by said ribonucleic acid.
 2. A molecule comprising a ribonucleic acid covalently bonded to an antibody, said antibody being entirely encoded by said ribonucleic acid.
 3. The molecule of claim 1, wherein said amide bond is resistant to cleavage by a ribosome.
 4. The molecule of any of claims 1, 2, or 3, wherein said antibody is a single chain antibody.
 5. The molecule of any of claims 1, 2, or 3, wherein said antibody is a human antibody.
 6. The molecule of any of claims 1, 2, or 3, wherein the variable region of said antibody is derived from a human antibody.
 7. The molecule of any of claims 1, 2, or 3, wherein the variable region of said antibody comprises a randomized region.
 8. The molecule of claim 1 or 2, wherein said molecule further comprises a peptide acceptor positioned between said ribonucleic acid and said antibody.
 9. An RNA-antibody fusion selected by the method comprising the steps of: a) providing a population of candidate RNA molecules, each of which comprises a translation initiation sequence and a start codon operably linked to a candidate antibody coding sequence and each of which is operably linked to a peptide acceptor at the 3′ end of said candidate antibody coding sequence; b) in vitro or in situ translating said candidate antibody coding sequences to produce a population of candidate RNA-antibody fusions; c) contacting said population of candidate RNA-antibody fusions with a target antigen; and d) selecting an RNA-antibody fusion that binds to said target antigen.
 10. An RNA-antibody fusion selected by the method comprising the steps of: a) providing a population of candidate RNA molecules, each of which comprises a translation initiation sequence and a start codon operably linked to a candidate antibody coding sequence and each of which is operably linked to a peptide acceptor at the 3′ end of said candidate antibody coding sequence; b) in vitro translating said candidate antibody coding sequences to produce a population of candidate RNA-antibody fusions; c) contacting said population of candidate RNA-antibody fusions with a target antigen; and d) selecting an RNA-antibody fusion that binds to said target antigen.
 11. An RNA-antibody fusion comprising an antibody portion having an altered binding affinity or specificity for a target antigen relative to the binding affinity or specificity of a reference antibody, said RNA-antibody fusion selected by the method comprising the steps of: a) producing a population of candidate RNA molecules, each having a candidate antibody coding sequence which differs from said reference antibody coding sequence, said RNA molecules each comprising a translation initiation sequence and a start codon operably linked to said candidate antibody coding sequence and each being operably linked to a peptide acceptor at the 3′ end; b) in vitro or in situ translating said candidate antibody coding sequences to produce a population of candidate RNA-antibody fusions; c) contacting said population of candidate RNA-antibody fusions with said target antigen; and d) selecting an RNA-antibody fusion, the antibody portion of which has a binding affinity or specificity for said target antigen that is altered relative to the binding affinity or specificity of said reference antibody for said target antigen.
 12. An RNA-antibody fusion comprising an antibody portion having an altered binding affinity or specificity for a target antigen relative to the binding affinity or specificity of a reference antibody, said RNA-antibody fusion selected by the method comprising the steps of: a) producing a population of candidate RNA molecules, each having a candidate antibody coding sequence which differs from said reference antibody coding sequence, said RNA molecules each comprising a translation initiation sequence and a start codon operably linked to said candidate antibody coding sequence and each being operably linked to a peptide acceptor at the 3′ end; b) in vitro translating said candidate antibody coding sequences to produce a population of candidate RNA-antibody fusions; c) contacting said population of candidate RNA-antibody fusions with said target antigen; and d) selecting an RNA-antibody fusion, the antibody portion of which has a binding affinity or specificity for said target antigen that is altered relative to the binding affinity or specificity of said reference antibody for said target antigen.